23. Deep NLP 2

RNN review, bi-directional RNNs, LSTM & GRU cells.

## Resources
- Overview Articles:
** Unreasonable Effectiveness of RNNs (http://karpathy.github.io/2015/05/21/rnn-effectiveness/) `article:easy`
** Deep Learning, NLP, and Representations (http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/) `article:medium`
** Understanding LSTM Networks (http://colah.github.io/posts/2015-08-Understanding-LSTMs/) `article:medium`
- Stanford cs224n: Deep NLP (https://www.youtube.com/playlist?list=PL3FW7Lu3i5Jsnh1rnUwq_TcylNr7EkRe6) `course:medium` (replaces cs224d)
- TensorFlow Tutorials (https://www.tensorflow.org/tutorials/word2vec) `tutorial:medium` (start at Word2Vec + next 2 pages)
- The usual DL resources (pick one):
** Deep Learning Book (http://amzn.to/2tXgCiT) (Free HTML version (http://www.deeplearningbook.org/)) `book:hard` comprehensive DL bible; highly mathematical
** Fast.ai (http://course.fast.ai/) `course:medium` practical DL for coders
** Neural Networks and Deep Learning (http://neuralnetworksanddeeplearning.com/) `book:medium` shorter online "book"
## Episode

RNN Review
- Vanilla RNN: when the words plus a running left-to-right context are sufficient
** Examples: POS tagging, NER, stock prices, weather
- Bidirectional RNN (BiLSTM): when context to the right of a word helps too
- Encoder/decoder (seq2seq): when the model should "hear" the whole input first, then generate something different
** Examples: classification, sentiment analysis, translation
- All of these now start from word embeddings
- Training: backpropagation through time (BPTT)
- Vanishing/exploding gradients motivate LSTMs (http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
- ReLU vs sigmoid vs tanh (nonlinearities; future episode)
- LSTM cell gates:
** Forget gate layer: decides what to discard from the cell state
** Input gate layer: decides which values to update
** Tanh layer: creates new candidate values
** Output gate layer: decides what to expose as the hidden state
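The four gate layers above can be sketched as a single LSTM forward step in NumPy. This is a minimal illustration of the gate math from the Colah article, not any particular library's API; the stacked weight layout and the function/variable names here are assumptions for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    x: input vector (D,); h_prev, c_prev: previous hidden/cell state (H,).
    W: all four gates' weights stacked, shape (4*H, H + D); b: biases (4*H,).
    (Stacking the gate weights into one matrix is a common but assumed layout.)
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[:H])            # forget gate: what to drop from c_prev
    i = sigmoid(z[H:2 * H])       # input gate: which values to update
    g = np.tanh(z[2 * H:3 * H])   # tanh layer: new candidate values
    o = sigmoid(z[3 * H:])        # output gate: what to expose as h
    c = f * c_prev + i * g        # new cell state: keep some old, add some new
    h = o * np.tanh(c)            # new hidden state
    return h, c
```

Running this step in a loop over a sequence is the unrolled RNN that BPTT differentiates through; the additive `f * c_prev + i * g` update is what lets gradients flow farther than in a vanilla RNN.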
