Deep Learning for Time-Series Prediction

Deep learning is a branch of machine learning: a family of algorithms that attempt to model high-level abstractions in data using multiple processing layers composed of complex structures or multiple non-linear transformations. Deep learning is a machine learning approach based on learning representations of data. An observation (for example, an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a set of edges or regions of a particular shape. Some representations make it easier to learn a task (for example, face recognition or facial-expression recognition) from examples.

One promise of deep learning is to replace hand-crafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.

The goal of representation learning is to find better ways of representing data and to build better models that learn these representations from large-scale unlabeled data. Some of these representations are inspired by advances in neuroscience and are loosely based on an understanding of information processing and communication patterns in the nervous system, such as neural coding, which attempts to characterize the relationship between stimuli and the responses of neurons, and the relationships among the electrical activity of neurons in the brain.

Several deep learning architectures, such as deep neural networks, convolutional neural networks, deep belief networks, and recurrent neural networks, have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, and bioinformatics, with excellent results. At the same time, "deep learning" has become something of a buzzword, or a rebranding of neural networks.

I. Introduction

II. Time-Series Prediction

You can do time-series prediction with neural nets, but it can get pretty tricky.

  1. The obvious choice is a recurrent neural network (RNN). However, RNNs can be really difficult to train, and I would not recommend them if this is your first time using neural nets. There has recently been some interesting work on easing the training of RNNs (e.g. Hessian-free optimization), but again - it's probably not for beginners ;-) Alternatively, you could try a scheme where you use a standard neural net (i.e. not an RNN) and try to predict the next frame of data from the previous frames (see the sketch after this list). That might work.

  2. This question is too general; there is no categorical right answer. Yes, you can use unsupervised feature learning as part of your solution (e.g. pre-training your model), but if your end goal is time-series prediction you will need to do some supervised learning too.
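To make the scheme in point 1 concrete, here is a minimal sketch of a sliding-window predictor: a one-hidden-layer feedforward net in plain NumPy, trained by SGD to map the previous `window` values to the next one. The toy series, architecture, and hyperparameters are all illustrative choices of mine, not anything prescribed above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy series: a noisy sine wave (stand-in for real data).
t = np.arange(0, 400)
series = np.sin(0.1 * t) + 0.05 * rng.standard_normal(t.size)

# Turn the series into (window -> next value) supervised pairs.
window = 10
X = np.stack([series[i:i + window] for i in range(series.size - window)])
y = series[window:]

# One-hidden-layer net: pred = W2 @ tanh(W1 @ x + b1) + b2
hidden = 16
W1 = 0.1 * rng.standard_normal((hidden, window))
b1 = np.zeros(hidden)
W2 = 0.1 * rng.standard_normal(hidden)
b2 = 0.0

lr = 0.01
for epoch in range(200):
    for i in rng.permutation(X.shape[0]):
        x, target = X[i], y[i]
        h = np.tanh(W1 @ x + b1)
        pred = W2 @ h + b2
        err = pred - target              # d(loss)/d(pred) for squared error
        # Backpropagate through the two layers.
        dW2 = err * h
        db2 = err
        dh = err * W2 * (1.0 - h ** 2)   # tanh'(z) = 1 - tanh(z)^2
        dW1 = np.outer(dh, x)
        db1 = dh
        W2 -= lr * dW2; b2 -= lr * db2
        W1 -= lr * dW1; b1 -= lr * db1

# One-step-ahead prediction from the last observed window.
last = series[-window:]
print("next value ~", W2 @ np.tanh(W1 @ last + b1) + b2)
```

Framing prediction as plain supervised regression like this sidesteps RNN training entirely, at the cost of a fixed, finite memory of `window` steps.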

There has been some work on adapting deep learning methods for sequential data. A lot of this work has focused on developing "modules" which can be stacked in a way analogous to stacking restricted Boltzmann machines (RBMs) or autoencoders to form a deep neural network (a small stacking sketch follows the list). I'll highlight a few below:

  • Conditional RBMs: Probably one of the most successful applications of deep learning to time series. Taylor develops an RBM-like model that adds temporal interactions between visible units and applies it to modeling motion-capture data. Essentially you end up with something like a linear dynamical system with some non-linearity added by the hidden units.
  • Temporal RBMs: In his thesis (section 3), Ilya Sutskever develops several RBM-like models with temporal interactions between units. He also presents some interesting results showing that training recurrent neural networks with SGD can perform as well as or better than more complex methods, such as Martens' Hessian-free algorithm, given good initialization and a slightly modified update equation for momentum (sketched after this list).
  • Recursive autoencoders: Lastly I'll mention the work of Richard Socher on using recursive autoencoders for parsing. Although parsing isn't time-series prediction, it is definitely related.
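To show what the stacking mentioned above looks like in the simplest case, here is a hedged sketch using autoencoders rather than RBMs (RBM training would also need contrastive divergence, roughly doubling the code). Two tied-weight autoencoder layers are pretrained greedily on sliding windows, and a supervised linear readout is then fit on the top-layer codes, matching point 2's "pre-training plus supervised learning" recipe. Every function name and hyperparameter here is my own illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_autoencoder(data, n_hidden, lr=0.01, epochs=100):
    """One tied-weight autoencoder layer trained by SGD on squared error."""
    n_vis = data.shape[1]
    W = 0.1 * rng.standard_normal((n_hidden, n_vis))
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(n_vis)
    for _ in range(epochs):
        for i in rng.permutation(data.shape[0]):
            x = data[i]
            h = np.tanh(W @ x + b_h)        # encode
            x_hat = W.T @ h + b_v           # decode with tied weights
            err = x_hat - x
            dh = (W @ err) * (1.0 - h ** 2)
            W -= lr * (np.outer(h, err) + np.outer(dh, x))
            b_v -= lr * err
            b_h -= lr * dh
    return W, b_h

# Toy series and sliding windows, as in the earlier sketch.
t = np.arange(0, 400)
series = np.sin(0.1 * t) + 0.05 * rng.standard_normal(t.size)
window = 10
X = np.stack([series[i:i + window] for i in range(series.size - window)])
y = series[window:]

# Greedy layer-wise pretraining: each layer autoencodes the previous codes.
W1, b1 = train_autoencoder(X, n_hidden=16)
H1 = np.tanh(X @ W1.T + b1)
W2, b2 = train_autoencoder(H1, n_hidden=8)
H2 = np.tanh(H1 @ W2.T + b2)

# Supervised step: fit a linear readout (least squares) on the top codes.
A = np.hstack([H2, np.ones((H2.shape[0], 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print("train MSE:", np.mean((A @ w - y) ** 2))
```

In a full implementation you would also backpropagate through the whole stack during fine-tuning; fitting only the readout keeps the sketch short.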
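On Sutskever's "slightly modified equation for momentum": my understanding (an assumption on my part) is that this is the Nesterov-style update, where the gradient is evaluated at a look-ahead point rather than at the current parameters, as later laid out in Sutskever, Martens, Dahl, and Hinton's ICML 2013 paper on initialization and momentum. A side-by-side sketch of the two updates:

```python
import numpy as np

def classical_momentum(w, v, grad, lr=0.01, mu=0.9):
    """Classical momentum: accumulate velocity from the gradient at w."""
    v = mu * v - lr * grad(w)
    return w + v, v

def nesterov_momentum(w, v, grad, lr=0.01, mu=0.9):
    """Nesterov-style momentum: evaluate the gradient at the
    look-ahead point w + mu * v before accumulating velocity."""
    v = mu * v - lr * grad(w + mu * v)
    return w + v, v

# Tiny demo on an ill-conditioned quadratic bowl f(w) = 0.5 * w @ A @ w.
A = np.diag([1.0, 10.0])
grad = lambda w: A @ w
w, v = np.array([5.0, 5.0]), np.zeros(2)
for _ in range(100):
    w, v = nesterov_momentum(w, v, grad)
print("after 100 steps:", w)  # should be near the minimum at the origin
```

The look-ahead gradient makes large momentum coefficients more stable late in training, which is the kind of effect those results exploit.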