# Introduction To Deep Learning

## What Is Deep Learning And How Can I Study It?

Tyler Elliot Bettilyon · Jun 13, 2018

*I took a Deep Learning course through The Bradfield School of Computer Science in June. This series is a journal about what I learned in class, and what I've learned since.*

*This is the first article in the series, and it covers the recommended preparation for the Deep Learning course and what we learned in the first class. Read the second article here, and the third here.*

Although normally the "prework" comes before the introduction, I'm going to give the 30,000-foot view of the fields of artificial intelligence, machine learning, and deep learning at the top. I have found that this context can really help us understand why the prerequisites seem so broad, and help us study just the essentials. Besides, the history and landscape of artificial intelligence are interesting, so let's dive in!

### Artificial Intelligence, Machine Learning, and Deep Learning

Deep learning is a subset of machine learning. Machine learning is a subset of artificial intelligence. Said another way --- all deep learning algorithms are machine learning algorithms, but many machine learning algorithms do not use deep learning. As a Venn Diagram, it looks like this:

Deep learning refers specifically to a class of algorithm called a neural network, and technically only to "deep" neural networks (more on that in a second). The first neural network was invented in 1949, but back then neural networks weren't very useful. In fact, from the 1970s to the 2010s, traditional forms of AI consistently outperformed neural-network-based models.

These non-learning types of AI include rule-based algorithms (imagine an extremely complex series of if/else blocks); heuristic-based AIs, such as A* search; constraint satisfaction algorithms, like Arc Consistency; tree search algorithms, such as minimax (used by the famous Deep Blue chess AI); and more.
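
To make the tree-search idea concrete, here is a minimal minimax sketch in Python. The two-ply game tree and its scores are invented for illustration; a real chess engine like Deep Blue layers alpha-beta pruning and a hand-tuned evaluation function on top of this same core recursion.

```python
def minimax(node, maximizing):
    """Return the best achievable score from `node`.

    `node` is either a number (a leaf's score for the maximizing
    player) or a list of child nodes.
    """
    if isinstance(node, (int, float)):  # leaf: just report its score
        return node
    scores = [minimax(child, not maximizing) for child in node]
    # The maximizing player picks the largest score; the
    # minimizing opponent picks the smallest.
    return max(scores) if maximizing else min(scores)

# A tiny two-ply game: we move first (maximize), the opponent replies (minimize).
game_tree = [[3, 5], [2, 9]]
best = minimax(game_tree, maximizing=True)
print(best)  # 3: the opponent forces 3 in the left branch, 2 in the right
```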

There were two things preventing machine learning, and especially deep learning, from being successful: a lack of large datasets and a lack of computational power. In 2018 we have exabytes of data, and anyone with an AWS account and a credit card has access to a distributed supercomputer. Because of this new availability of data and computing power, machine learning --- and especially deep learning --- has taken the AI world by storm.

You should know that there are other categories of machine learning, such as unsupervised learning and reinforcement learning, but for the rest of this article I will be talking about a subset of machine learning called supervised learning.

Supervised learning algorithms work by forcing the machine to repeatedly make predictions. Specifically, we ask it to make predictions about data that we (the humans) already know the correct answer for. This is called "labeled data" --- the label is whatever we want the machine to predict.

Here's an example: let's say we wanted to build an algorithm to predict whether someone will default on their mortgage. We would need a bunch of examples of people who did and did not default on their mortgages. We take the relevant data about these people; feed it into the machine learning algorithm; ask the algorithm to make a prediction about each person; and, after it guesses, tell the machine what the right answer actually was. Based on how right or wrong it was, the machine learning algorithm *changes how it makes predictions*.
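
As a sketch of what "labeled data" looks like in code, here is a toy version of the mortgage example. The features, labels, and the hand-written decision rule are all invented; a real supervised learner would adjust that rule automatically based on its mistakes rather than having it written by hand.

```python
# Toy "labeled data": each row is (debt_to_income_ratio, missed_payments),
# and the label is whether the person defaulted (1) or not (0).
# Every number here is made up for illustration.
examples = [
    ((0.1, 0), 0),
    ((0.5, 2), 1),
    ((0.2, 0), 0),
    ((0.6, 3), 1),
]

def predict(features):
    # A deliberately crude hand-written rule standing in for the model:
    # flag anyone with a high debt load or multiple missed payments.
    ratio, missed = features
    return 1 if ratio > 0.4 or missed > 1 else 0

# "Supervision": compare the predictions against the known labels.
correct = sum(predict(x) == label for x, label in examples)
accuracy = correct / len(examples)
print(accuracy)  # 1.0 on this tiny invented dataset
```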

We repeat this process *many, many* times, and through the miracle of mathematics our machine's predictions get better. The predictions get better relatively slowly, though, which is why we need so much data to train these algorithms.

Machine learning algorithms such as linear regression, support vector machines, and decision trees all "learn" in different ways, but fundamentally they all apply this same process: make a prediction, receive a correction, and adjust the prediction mechanism based on the correction. At a high level, it's quite similar to how a human learns.
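
The predict-correct-adjust loop can be sketched in a few lines of plain Python with one-variable linear regression. The dataset (points on the line y = 2x) and the learning rate are arbitrary choices for this illustration.

```python
# A bare-bones "predict, receive a correction, adjust" loop:
# one-variable linear regression trained by gradient descent.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # points on y = 2x
w = 0.0            # the single parameter we will learn
lr = 0.05          # learning rate: how big each adjustment is

for _ in range(200):            # repeat the process many, many times
    for x, y in data:
        prediction = w * x      # 1. make a prediction
        error = prediction - y  # 2. receive a correction
        w -= lr * error * x     # 3. adjust the prediction mechanism

print(round(w, 3))  # ~2.0, the slope the data was generated with
```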

Recall that deep learning is a subset of machine learning which focuses on a specific category of machine learning algorithms called neural networks. Neural networks were originally inspired by the way human brains work --- individual "neurons" receive "signals" from other neurons and in turn send "signals" to other "neurons". Each neuron transforms the incoming "signals" in some way, and eventually an output signal is produced. If everything went well that signal represents a correct prediction!

This is a helpful mental model, but computers are not biological brains. They do not have neurons, or synapses, or any of the other biological mechanisms that make brains work. Because the biological model breaks down, researchers and scientists instead use graph theory to model neural networks --- instead of describing neural networks as "artificial brains", they describe them as complex graphs with powerful properties.

Viewed through the lens of graph theory a neural network is a series of layers of connected nodes; each node represents a "neuron" and each connection represents a "synapse".

Different kinds of nets have different kinds of connections. The simplest form of deep learning is the deep neural network: a graph with a series of fully connected layers. Every node in a particular layer has an edge to every node in the next layer, and each of these edges is given a different weight. The whole series of layers is the "brain". It turns out that if the weights on all these edges are set *just right*, these graphs can do some incredible "thinking".
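
A single fully connected layer can be sketched with NumPy: the weight matrix holds one number per edge, and a layer's output is just a matrix operation followed by a nonlinearity. The sizes and values below are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

inputs = np.array([1.0, 0.5, -0.2])   # signals from the previous layer (3 nodes)
weights = rng.normal(size=(3, 4))      # one weight per edge: 3 inputs x 4 outputs
biases = np.zeros(4)

# Each node in the next layer sums its weighted incoming signals,
# then applies a nonlinearity (ReLU here) to produce its own "signal".
z = inputs @ weights + biases
outputs = np.maximum(z, 0.0)           # ReLU

print(outputs.shape)  # (4,): one signal per node in the next layer
```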

Ultimately, the Deep Learning course will be about how to construct different versions of these graphs; tune the connection weights until the system works; and try to make sure our machine does what we *think* it's doing. The mechanics that make deep learning work, such as gradient descent and backpropagation, combine a lot of ideas from different mathematical disciplines. In order to *really understand* neural networks, we need some math background.

### Background Knowledge --- A Little Bit Of Everything

Given how easy to use libraries like PyTorch and TensorFlow are, it's really tempting to say, "you don't need the math *that much*." But after doing the required reading for the two classes, I'm glad I have some previous math experience. A subset of topics from linear algebra, calculus, probability, statistics, and graph theory has already come up.

Getting this knowledge at university would entail taking roughly five courses: calculus 1, 2, and 3; linear algebra; and computer science 101. Luckily, you don't need each of those fields **in their entirety.** Based on what I've seen so far, this is what I would recommend studying if you want to get into neural networks yourself:

From linear algebra, you need to know the **dot product, matrix multiplication (especially the rules for multiplying matrices with different sizes), and transposes.** You don't have to be able to do these things quickly by hand, but you should be comfortable enough to do small examples on a whiteboard or paper. You should also feel comfortable working with "multidimensional spaces" --- deep learning uses a lot of many-dimensional vectors.

I love 3Blue1Brown's Essence of Linear Algebra for a refresher or an introduction to linear algebra. Additionally, compute a few dot products and matrix multiplications by hand (with small vector/matrix sizes). Although we use graph theory to model neural networks, these graphs are represented in the computer by matrices and vectors for efficiency reasons. You should be comfortable both thinking about and programming with vectors and matrices.
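
If you want to check your hand calculations, NumPy makes these operations easy to experiment with; the vectors and matrices below are arbitrary small examples.

```python
import numpy as np

# Dot product: multiply elementwise, then sum.
v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
dot = v @ w                      # 1*4 + 2*5 + 3*6 = 32

# Matrix multiplication: the inner sizes must match.
A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
B = np.ones((3, 4))              # shape (3, 4)
C = A @ B                        # inner sizes agree (3), result is (2, 4)

# Transposing swaps the axes, so A.T has shape (3, 2):
# A @ A.T is a valid (2, 2) product, while A @ A is not.
print(dot, C.shape, A.T.shape)
```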

From calculus, you need to know the derivative, and ideally you should know it pretty well. Neural networks involve **simple derivatives, the chain rule, partial derivatives, and the gradient**. The derivative is used by neural nets to solve **optimization problems**, so you should understand how the derivative can be used to find the "direction of greatest increase". A good intuition is probably enough, but if you solve a couple of simple optimization problems using the derivative, you'll be happy you did. 3Blue1Brown also has an Essence of Calculus series, which is lovely as a more holistic review of calculus.

Gradient descent and backpropagation both make heavy use of derivatives to fine-tune the networks during training. You don't have to know how to solve big, complex derivatives with compounding chain and product rules, but having a feel for partial derivatives of simple equations helps a lot.

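
Here is the derivative-as-direction idea in miniature: gradient descent on the one-variable function f(x) = (x - 3)², whose derivative is f'(x) = 2(x - 3). The starting point and step size are arbitrary choices for this sketch.

```python
# The derivative points in the direction of greatest increase,
# so to minimize f we repeatedly step the *opposite* way.

def f_prime(x):
    return 2 * (x - 3)  # derivative of f(x) = (x - 3)**2

x = 0.0                      # arbitrary starting point
for _ in range(100):
    x -= 0.1 * f_prime(x)    # step against the gradient

print(round(x, 4))  # ~3.0, the minimum of f
```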

From probability and statistics, you should know about common distributions, the idea of metrics, accuracy vs. precision, and hypothesis testing. By far the most common applications of neural networks are to make predictions or judgements of some kind. Is this a picture of a dog? Will it rain tomorrow? Should I show Tyler *this* advertisement, or *that* one? Statistics and probability will help us assess the accuracy and usefulness of these systems.
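
The accuracy-vs-precision distinction is easy to see on a made-up "is this a dog?" classifier; the labels and predictions below were chosen so that the two metrics disagree.

```python
# Invented evaluation data: 1 = dog, 0 = not a dog.
labels      = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))  # true positives
fp = sum(p == 1 and y == 0 for p, y in zip(predictions, labels))  # false positives

# Accuracy: what fraction of ALL predictions were right?
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
# Precision: of everything flagged "dog", how much really was a dog?
precision = tp / (tp + fp)

print(accuracy, precision)  # 0.8 and 2/3
```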

It's worth noting that the statistics appear more on the applied side, while the graph theory, calculus, and linear algebra all appear on the implementation side. I think it's best to understand both, but if you're only going to be *using* a library like TensorFlow and are not interested in *implementing* these algorithms yourself, it might be wise to focus on the statistics more than the calculus and linear algebra.

Finally, the graph theory. Honestly, if you can define the terms "vertex", "edge", and "edge weight", you've probably got enough graph theory under your belt. Even this "Gentle Introduction" has more information than you need.

In the next article in this series I'll be examining Deep Neural Networks and how they are constructed. See you then!

Part 2: Deep Neural Networks as Computational Graphs

Part 3: Classifying MNIST Digits With Different Neural Network Architectures
