DyNet
arXiv:1701.03980
Abstract
C++ backend
lightweight graph representation
Problems addressed
- easier debugging / maintaining large projects
- models expressed naturally
Static vs Dynamic
static
model is pre-written, then transferred to a computation graph
graph can be well optimized
cannot handle variable input sizes (e.g. NLP RNNs over variable-length sequences)
cannot handle variable input structure (e.g. Tree NNs, Graph NNs)
hard to express complex control-flow logic; the interface design becomes nontrivial
difficult to debug during execution
dynamic
do computation on the fly
can be expensive; flow control and variable-sized inputs live in the host language (e.g. Python)
DyNet
lazy execution: graph construction is symbolic; computation runs only when explicitly requested via value()
core abstractions
Model, Parameter, Trainer, Expression, operations
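A minimal sketch of the lazy-execution idea in pure Python (illustrative only, not DyNet's actual API): building an expression records operations symbolically, and the forward pass runs only when value() is requested.

```python
# Sketch of lazy expression evaluation (not DyNet's real classes):
# graph construction is cheap and symbolic; arithmetic happens only
# when .value() is called, and results are memoized.

class Expr:
    def __init__(self, fn, args=()):
        self.fn = fn          # function computing this node's value
        self.args = args      # incoming expressions
        self._cached = None   # memoized forward value

    def value(self):
        if self._cached is None:
            self._cached = self.fn(*[a.value() for a in self.args])
        return self._cached

def const(x):
    return Expr(lambda: x)

def add(a, b):
    return Expr(lambda x, y: x + y, (a, b))

def mul(a, b):
    return Expr(lambda x, y: x * y, (a, b))

# Building the graph does no arithmetic...
e = add(mul(const(2), const(3)), const(4))
# ...until value() triggers the forward pass.
print(e.value())  # -> 10
```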
examples
- tree-structured encoder NN
- dynamic control flow: in a sequential model, break early once some score is high enough
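The early-break example above can be sketched as plain host-language control flow (the scoring function and the "RNN step" below are hypothetical stand-ins, not from the paper): in a define-by-run framework, the graph for each input contains only the steps actually executed before the break.

```python
# Sketch of dynamic control flow: stop a sequential model early once a
# score clears a threshold. The step and scorer are toy stand-ins.

def score(state):
    # stand-in for a learned scorer; here just the running state itself
    return state

def run_until_confident(inputs, threshold):
    state = 0.0
    steps = 0
    for x in inputs:
        state += x          # stand-in for one RNN step
        steps += 1
        if score(state) >= threshold:
            break           # host-language control flow shapes the graph
    return state, steps

state, steps = run_until_confident([0.4, 0.5, 0.3, 0.9], threshold=1.0)
print(steps)  # -> 3 (stopped before consuming all 4 inputs)
```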
Computation Graph (CG)
Node
represents a variable (some intermediate result)
maintains a list of incoming nodes and a single incoming function
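A sketch of this node representation (assuming, as the design suggests, that nodes are stored in creation order, so creation order is a valid topological order and the forward pass is a single loop):

```python
# Sketch of a CG as an append-only node list: each node holds one
# incoming function and the indices of its incoming nodes.

class Node:
    def __init__(self, fn, inputs=()):
        self.fn = fn          # the single incoming function
        self.inputs = inputs  # indices of incoming nodes

graph = []                     # the computation graph

def add_node(fn, inputs=()):
    graph.append(Node(fn, inputs))
    return len(graph) - 1      # node id = position in the list

def forward(graph):
    values = []
    for node in graph:         # creation order is a topological order
        values.append(node.fn(*[values[i] for i in node.inputs]))
    return values

a = add_node(lambda: 3.0)
b = add_node(lambda: 4.0)
c = add_node(lambda x, y: x * y, (a, b))
print(forward(graph)[c])  # -> 12.0
```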
C-level implementation
manual memory manager
Cython bindings
Eigen library for tensor math
UI
RNN builders (comparable to PyTorch Modules)
Optimizations
sparse updates
benefit may be negligible on GPU
does not work under some optimizers
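The sparse-update idea can be sketched for an embedding table (illustrative pure-Python, not DyNet's implementation): only rows that received gradient are touched. This saves work on CPU, but the saving can be negligible on GPU, and it conflicts with optimizers that must update dense per-row state (e.g. moment estimates) every step.

```python
# Sketch of a sparse SGD update: row_grads maps row index -> gradient,
# so untouched embedding rows are skipped entirely.

def sparse_sgd_update(table, row_grads, lr=0.1):
    # table: list of rows; row_grads: {row_index: gradient_vector}
    for i, g in row_grads.items():
        table[i] = [w - lr * gw for w, gw in zip(table[i], g)]

emb = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
sparse_sgd_update(emb, {1: [1.0, 1.0]})   # only row 1 is updated
print(emb[1])  # -> [1.9, 1.9]
```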
Minibatch
treated as another aspect of UI design (batched operations behind the single-instance API)
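One common way to minibatch variable-length sequences is padding plus a mask (a standard approach sketched here, which batched operations can hide behind the same single-instance API):

```python
# Sketch of padding a batch of variable-length sequences to a common
# length, with a 0/1 mask marking real vs padded positions.

def pad_batch(seqs, pad=0):
    T = max(len(s) for s in seqs)
    padded = [s + [pad] * (T - len(s)) for s in seqs]
    mask = [[1] * len(s) + [0] * (T - len(s)) for s in seqs]
    return padded, mask

padded, mask = pad_batch([[5, 6, 7], [8]])
print(padded)  # -> [[5, 6, 7], [8, 0, 0]]
print(mask)    # -> [[1, 1, 1], [1, 0, 0]]
```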
Parallelism
CPU data parallelism (??)
shared-memory parameter-server mode (??)
Benchmark results
CPU
faster on LSTMs; variable-length vs fixed-length inputs
sparse updates work well (??)
GPU
weak, underwhelming performance
TODO
- multi-device support
- computation-graph optimization
- more operations and optimizers
Inspirations
- a variable computation graph is important to outperform TensorFlow
- the UI is simple to imitate
- Python's overhead is not significant for expression building, and even for graph computation