DyNet

arXiv:1701.03980

Abstract

C++ backend

lightweight graph representation

Problems addressed

  • easier debugging / maintaining large projects
  • expressing models naturally

Static vs Dynamic

static

the model is written once, then transferred (compiled) into a computation graph before execution

the graph can be well optimized, since it is built once and reused

cannot easily deal with variable input sizes (e.g., variable-length sentences for RNNs in NLP)

cannot easily deal with variable input structure (e.g., Tree NNs, Graph NNs)

hard to express complex flow-control logic (an interface-design problem)

nontrivial interface: special control-flow ops instead of the host language's own constructs

debugging is difficult: errors surface during graph execution, far from where the model was declared

dynamic

computation is done on the fly, as the graph is built per example

can be expensive, but flow control and variable-sized inputs live in the host language (e.g., Python)

DyNet

lazy execution: nothing is computed until you explicitly ask for a result via the value() method
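The lazy-execution idea can be sketched in plain Python (a mock, not the real DyNet API): operators only record graph nodes, and the forward pass runs only when value() is explicitly called.

```python
# Mock of lazy expressions: building z performs no arithmetic;
# z.value() triggers a recursive forward pass and caches the result.

class Expression:
    def __init__(self, func, args):
        self.func = func      # the op producing this node
        self.args = args      # incoming expressions
        self._cached = None

    def __add__(self, other):
        return Expression(lambda a, b: a + b, [self, other])

    def __mul__(self, other):
        return Expression(lambda a, b: a * b, [self, other])

    def value(self):
        # forward computation happens here, lazily
        if self._cached is None:
            vals = [a.value() for a in self.args]
            self._cached = self.func(*vals)
        return self._cached

def const(v):
    return Expression(lambda: v, [])

x = const(3.0)
y = const(4.0)
z = x * y + const(1.0)   # no computation yet, only graph construction
print(z.value())         # -> 13.0
```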

Core abstractions: Model, Parameter, Trainer, Expression, Operations

examples

  • Tree-structured encoder networks
  • Dynamic control flow: break out of a sequential model once some score is high enough
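The second example can be sketched with a hypothetical `step` function: because the graph is rebuilt per example, an ordinary host-language `break` stops the unrolling, with no special control-flow ops.

```python
# Sketch of dynamic control flow (names are illustrative, not DyNet API):
# stop unrolling a serial model as soon as a score clears a threshold.

def run_until_confident(inputs, step, threshold=0.9):
    """step(state, x) -> (new_state, score); hypothetical signature."""
    state = 0.0
    for x in inputs:
        state, score = step(state, x)
        if score >= threshold:   # plain Python control flow
            break
    return state

# toy step: accumulate inputs and report a pseudo-confidence
toy_step = lambda s, x: (s + x, min(1.0, (s + x) / 10.0))
print(run_until_confident([3, 4, 5, 6], toy_step))  # stops after 3 of 4 inputs
```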

Computation Graph (CG)

Node

a variable (some intermediate result of the computation)

each node maintains a list of incoming nodes and a single incoming function (the op that produced it)
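That node layout can be sketched directly (a toy model of the structure, not DyNet's C++ internals): each node holds one function plus a list of argument nodes.

```python
# Toy CG node: one incoming function, a list of incoming nodes.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Node:
    func: Callable                                     # the one incoming op
    args: List["Node"] = field(default_factory=list)   # incoming nodes

def forward(node):
    # naive recursive evaluation; real engines topologically sort instead
    return node.func(*[forward(a) for a in node.args])

leaf = lambda v: Node(lambda: v)
n = Node(lambda a, b: a + b, [leaf(2), leaf(5)])
print(forward(n))  # -> 7
```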

C-level implementation

manual memory management

Cython bindings

Eigen library for tensor operations

UI

RNN builder (compare to PyTorch Modules)

Optimizations

sparse updates

the benefit may be negligible on GPU

does not work with some specific optimizers
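The idea behind a sparse update can be sketched as follows (assumed semantics; function and variable names are illustrative): only the lookup-table rows actually touched in a batch get a gradient step, rather than the whole table. Optimizers that keep dense per-parameter statistics (momentum-style) would lose this benefit, which is one reason it cannot apply everywhere.

```python
# Sketch: SGD step applied only to the embedding rows that were used.

def sparse_sgd_step(table, grads_by_row, lr=0.1):
    """table: list of row vectors; grads_by_row: {row_index: grad_vector}."""
    for i, g in grads_by_row.items():   # touch only the used rows
        table[i] = [w - lr * gw for w, gw in zip(table[i], g)]

vocab = [[0.0, 0.0] for _ in range(1000)]   # large lookup table
sparse_sgd_step(vocab, {3: [1.0, -1.0]})    # batch touched only row 3
print(vocab[3])  # -> [-0.1, 0.1]; all other rows untouched
```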

Minibatch

minibatching is another UI-design question: how batching is exposed to the user
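One common way batching surfaces in the UI (an assumed scheme, not DyNet-specific) is padding variable-length sequences to a common length and carrying a mask so padded positions are ignored:

```python
# Pad a batch of variable-length sequences and build the matching mask.

def pad_batch(seqs, pad=0):
    T = max(len(s) for s in seqs)                          # longest sequence
    batch = [s + [pad] * (T - len(s)) for s in seqs]       # right-pad
    mask = [[1] * len(s) + [0] * (T - len(s)) for s in seqs]
    return batch, mask

batch, mask = pad_batch([[1, 2, 3], [4]])
print(batch)  # [[1, 2, 3], [4, 0, 0]]
print(mask)   # [[1, 1, 1], [1, 0, 0]]
```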

Parallelism

CPU data parallelism (??)

shared-memory parameter-server mode (??)

Benchmark results

CPU

faster on LSTMs, for both variable-length and fixed-length sequences

sparse updates work well (??)

GPU

weak; performance is underwhelming

TODO

  • Multi-device support
  • CG optimization
  • More operations and optimizers

Inspirations

  • A variable (per-example) CG is important for outperforming TensorFlow
  • The UI is simple to fake
  • Python’s overhead is not significant, in expression building or even in graph computation