Reinforcement learning card game

One candidate input is the history of played cards (40 one-hot-encoded features, one per card in the deck?).
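As a sketch of what that representation could look like (assuming a 40-card deck and a hypothetical helper name; this is not the author's code), the history could be a 40-element binary vector with a 1 for every card already played:

```python
import numpy as np

DECK_SIZE = 40  # assumption: a 40-card deck, one feature per card

def encode_history(played_card_indices):
    """One-hot-style encoding: history[i] == 1.0 if card i has been played."""
    history = np.zeros(DECK_SIZE, dtype=np.float32)
    for idx in played_card_indices:
        history[idx] = 1.0
    return history

# Example: cards 0, 7 and 21 have been played so far
features = encode_history([0, 7, 21])
```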
Why not put the millions of dollars of R&D that Google poured into TensorFlow to work figuring out how to play this card game?
Build the model, then specify the loss function to optimize over.

When the players have 0 cards in hand, another 4 cards are dealt to each one, until the deck has 0 cards left, at which point we decide who won based on the number of cards eaten/won.

The convenience of TensorFlow is being able to set up the neural network declaratively, rather than having to program it functionally. For example, the model's weights can be declared as a variable initialized from a normal distribution:

```python
weights = tf.Variable(tf.random_normal([52, 52], stddev=0.35), name="weights")
```
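The dealing rule above can be sketched as a plain Python loop. This is a minimal sketch, not the author's implementation: the `trick_winner_fn` callback and the arbitrary card-play order are assumptions made for illustration.

```python
import random

def play_game(trick_winner_fn, deck_size=40, hand_size=4):
    """Deal 4 cards to each player whenever both hands are empty, until the
    deck runs out; the winner is whoever ate/won more cards.

    trick_winner_fn takes the pair of played cards and returns the index
    (0 or 1) of the player who eats them.
    """
    deck = list(range(deck_size))
    random.shuffle(deck)
    hands = [[], []]
    eaten = [0, 0]  # number of cards eaten/won by each player

    while deck or any(hands):
        if not hands[0] and not hands[1]:
            # Both hands empty: deal another 4 cards to each player
            for p in (0, 1):
                hands[p] = [deck.pop() for _ in range(hand_size)]
        # Each player plays one card (here: an arbitrary choice)
        played = [hands[0].pop(), hands[1].pop()]
        winner = trick_winner_fn(played)
        eaten[winner] += len(played)

    # Whoever ate more cards wins (ties go to player 1 in this sketch)
    return 0 if eaten[0] > eaten[1] else 1
```

With a 40-card deck and 4 cards per player per deal, this gives exactly 5 rounds of dealing before the deck is exhausted.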

It would have to play lots and lots of games against itself.
Gin Rummy training data: the training data I used was pairings of 1) the player's hand (a 4 x 13 matrix) and 2) the resulting hand evaluation (also a 4 x 13 matrix).
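A 4 x 13 hand matrix of this kind might be built like the following sketch, where rows are suits and columns are ranks. The card notation and the suit/rank ordering are my assumptions, not necessarily the author's:

```python
import numpy as np

SUITS = "CDHS"           # clubs, diamonds, hearts, spades (assumed order)
RANKS = "A23456789TJQK"  # ace through king (assumed order)

def hand_to_matrix(cards):
    """Encode a hand like ["AS", "TD", "3C"] as a 4 x 13 binary matrix:
    one row per suit, one column per rank."""
    m = np.zeros((4, 13), dtype=np.float32)
    for card in cards:
        rank, suit = card[0], card[1]
        m[SUITS.index(suit), RANKS.index(rank)] = 1.0
    return m

hand = hand_to_matrix(["AS", "TD", "3C"])
```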
Input, Output, and Parameters for TensorFlow. And computer scientists are lazy. Should the features be represented as a 1 x 120 vector or as a 3 x 40 matrix? The loss measurement is a sum of squares; in my hand-built system, this is calculated using the convolutional matrix above. How did we go from a 4 x 13 matrix representing the hand, described in the last post, to a 1 x 52 matrix here? Wouldn't it be better to use machine learning to figure that out for me?

```python
# Loss
loss = tf.reduce_sum(tf.square(y - d))

# Optimizer
optimizer = tf.train.GradientDescentOptimizer(0.0001)
train = optimizer.minimize(loss)
```

About the game: it's a fairly simple two-player game. I am wondering whether this is too much for the NN, or whether my input representation would be a bad fit for it.
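The jump from a 4 x 13 matrix to a 1 x 52 matrix is just a reshape: 4 x 13 = 52, so the hand flattens row by row into a single vector (and the same reasoning applies to 3 x 40 versus 1 x 120). A minimal NumPy sketch:

```python
import numpy as np

hand = np.zeros((4, 13), dtype=np.float32)
hand[0, 0] = 1.0  # e.g. the ace of the first suit

flat = hand.reshape(1, 52)  # row-major flatten: 4 x 13 -> 1 x 52
```

A dense layer sees both forms as the same 52 numbers; the shape only matters if the network exploits structure, e.g. a convolution over suits and ranks.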