
Introduction to Git for Data Science: Study Notes

Introduction to Git for Data Science

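大白话解释 Git 和 GitHub (an article on 伯乐在线 whose title means "Git and GitHub explained in plain language") gives a fairly concise account of what Git does: in essence, it automatically keeps a history of the changes made to your files.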

  • 4 hours
  • 0 Videos
  • 46 Exercises

Greg Wilson | DataCamp: he teaches Git and Shell, so he is worth following. Git makes sharing material very convenient, so let's get started.

Where does Git store information? | Shell

Each of your Git projects has two parts:

  • the files and directories that you create and edit directly,
  • and the extra information that Git records about the project’s history.

The combination of these two things is called a repository [1]. Git stores its extra information in a directory named .git, located in the root directory of the repository.

Note: for git status to work here, the repository must already contain a .git directory.
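The check behind that note can be sketched in plain shell. The function below is a hypothetical helper (find_repo_root is my own name, not a Git command) that walks up from a directory until it finds a .git entry, which is roughly what the "or any of the parent directories" wording in Git's error message suggests. In practice, git rev-parse --show-toplevel answers the same question.

```shell
# Hypothetical helper: print the nearest enclosing directory that contains
# a .git entry, or return 1 if none is found on the way up to /.
find_repo_root() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -e "$dir/.git" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}

# Usage: check the current directory.
find_repo_root . || echo "not inside a repository"
```

The test uses -e rather than -d because .git is usually a directory but can also be a plain file in some setups.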

How can I check the state of a repository? | Shell

$ cd dental
$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   report.txt

no changes added to commit (use "git add" and/or "git commit -a")

cd dental moves into the repository, and git status then lists the files that have been modified.

JiaxiangdeMacBook-Air:~ JiaxiangLi$ cd Downloads/Downloads/blog_ppdai_1.1/
JiaxiangdeMacBook-Air:blog_ppdai_1.1 JiaxiangLi$ git status
fatal: Not a git repository (or any of the parent directories): .git

So I have not yet set up a repository in this directory!
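A minimal sketch of how that error goes away, assuming Git is installed: git init creates the missing .git directory. The scratch directory from mktemp is only for illustration, so no real project folder is touched.

```shell
demo=$(mktemp -d)      # scratch directory, used only for this example
cd "$demo"

# Before: there is no .git directory, so this is not yet a repository.
[ -d .git ] || echo "no .git yet"

git init -q            # create an empty repository (adds the .git directory)

# After: .git exists and git status no longer fails.
[ -d .git ] && echo ".git created"
git status > /dev/null && echo "git status works"
```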

How can I tell what I have changed? | Shell

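Once changes are committed they become part of the recorded history, so there is no casual taking them back. The staging area is a halfway point: files there can still be added or taken out before the commit. git status lists the files that have been modified, both those already in the staging area and those not yet staged; none of them have been committed yet. git diff filename shows the changes to one file, git diff with no arguments shows all changes in the repository, and git diff directory shows the changes to files in that directory. In each case these are modifications that have not yet been staged or committed.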
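To convince myself of the difference between "modified" and "staged", here is a small sketch that assumes Git is installed. It runs in a throwaway repository; the identity passed with -c and the file contents are made up for the example. The --cached flag is not covered in this part of the course; it is the git diff option that shows what is in the staging area.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Create one committed file to start from (throwaway identity for the demo).
printf 'line one\n' > report.txt
git add report.txt
git -c user.name='Demo User' -c user.email='demo@example.com' \
    commit -q -m "Add report"

# Modify the file: the change is in the working directory only.
printf 'line two\n' >> report.txt
unstaged_before=$(git diff --name-only)    # lists report.txt

# Stage it: plain git diff is now empty, the change sits in the staging area.
git add report.txt
unstaged_after=$(git diff --name-only)     # empty
staged=$(git diff --cached --name-only)    # lists report.txt

echo "unstaged before add: $unstaged_before"
echo "unstaged after add:  $unstaged_after"
echo "staged after add:    $staged"
```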

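This is not very interesting so far; maybe I will watch Andrew Ng's course first. Even the first lesson does not feel that easy.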

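I also think I really have to try this on my own computer; I will not learn it just by reading along. So the first step is to get Git installed on this machine.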

What is in a diff? | Shell

diff --git a/report.txt b/report.txt
index e713b17..4c0742a 100644
--- a/report.txt
+++ b/report.txt
@@ -1,4 +1,4 @@
-# Seasonal Dental Surgeries 2017-18
+# Seasonal Dental Surgeries (2017) 2017-18

 TODO: write executive summary.

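Note that the code boxes here show the output of git ... commands, not the commands themselves. The first line, diff --git a/report.txt b/report.txt, is part of that output: it names the two versions being compared, where a is the old version and b is the new one. In --- a/report.txt and +++ b/report.txt, as in the body of the diff, - marks what was removed and + marks what was added. The line starting with @@ says where the change is. Each pair of numbers is a start line and a line count, so -1,4 means the hunk covers 4 lines starting at line 1 of the old version, and +1,4 means 4 lines starting at line 1 of the new version. In other words, the change falls within lines 1-4.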
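The @@ line is easier to read once it is split into its four numbers. The sketch below uses a hypothetical helper, parse_hunk (my own name), that pulls the start line and line count for the old and new versions out of a hunk header with awk. It assumes the usual "@@ -start,count +start,count @@" layout and treats a missing count as 1.

```shell
# Hypothetical helper: print "old_start old_count new_start new_count"
# for a unified-diff hunk header such as "@@ -1,4 +1,4 @@".
parse_hunk() {
    printf '%s\n' "$1" | awk '{
        old_rng = substr($2, 2)                 # drop the leading "-"
        new_rng = substr($3, 2)                 # drop the leading "+"
        if (split(old_rng, o, ",") < 2) o[2] = 1
        if (split(new_rng, n, ",") < 2) n[2] = 1
        print o[1], o[2], n[1], n[2]
    }'
}

parse_hunk '@@ -1,4 +1,4 @@'                 # prints: 1 4 1 4
parse_hunk '@@ -23,3 +23,4 @@ Date,Tooth'    # prints: 23 3 23 4
```

The second header reads as: 3 lines starting at line 23 in the old file became 4 lines starting at line 23 in the new file, which is what a single added line looks like.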

What’s the first step in saving changes? | Shell

$ cd dental
$ pwd
/home/repl/dental
$ git add report.txt
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   report.txt

How can I tell what’s going to be committed? | Shell

$ cd dental
$ git add data/northern.csv
$ git diff -r HEAD
diff --git a/data/eastern.csv b/data/eastern.csv
index b3c1688..85053c3 100644
--- a/data/eastern.csv
+++ b/data/eastern.csv
@@ -23,3 +23,4 @@ Date,Tooth
 2017-08-02,canine
 2017-08-03,bicuspid
 2017-08-04,canine
+2017-11-02,molar
diff --git a/data/northern.csv b/data/northern.csv
index 5eb7a96..5a2a259 100644
--- a/data/northern.csv
+++ b/data/northern.csv
@@ -22,3 +22,4 @@ Date,Tooth
 2017-08-13,incisor
 2017-08-13,wisdom
 2017-09-07,molar
+2017-11-01,bicuspid

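In git diff -r HEAD path/to/file:

  • -r means "compare to a particular revision"
  • HEAD refers to the most recent commit

Note, however, that two files show up as modified here, data/eastern.csv and data/northern.csv, even though only data/northern.csv was staged: the comparison is against the last commit, so staged and unstaged changes both appear.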
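Why both files appear can be reproduced in a throwaway repository. This is a sketch that assumes Git is installed; the file contents and the identity passed with -c are invented for the example, and --name-only (a standard git diff option) is used only to keep the output short.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir data

# Commit two small data files as the starting point.
printf 'Date,Tooth\n' > data/eastern.csv
printf 'Date,Tooth\n' > data/northern.csv
git add data
git -c user.name='Demo User' -c user.email='demo@example.com' \
    commit -q -m "Add data files"

# Change both files, but stage only northern.csv.
printf '2017-11-02,molar\n' >> data/eastern.csv
printf '2017-11-01,bicuspid\n' >> data/northern.csv
git add data/northern.csv

unstaged_only=$(git diff --name-only)                          # eastern only
vs_head=$(git diff --name-only -r HEAD)                        # both files
vs_head_one=$(git diff --name-only -r HEAD data/northern.csv)  # northern only

printf 'not staged:\n%s\n' "$unstaged_only"
printf 'changed since HEAD:\n%s\n' "$vs_head"
printf 'changed since HEAD, one path:\n%s\n' "$vs_head_one"
```

Giving a path after HEAD narrows the comparison to that one file.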

Interlude: how can I edit a file? | Shell

  • Ctrl-K: delete a line.
  • Ctrl-U: un-delete a line.
  • Ctrl-O: save the file (‘O’ stands for ‘output’).
  • Ctrl-X: exit the editor.

I rarely use Ctrl-K and Ctrl-U (both are nano shortcuts), so I need to practice them more.

How do I commit changes? | Shell

$ cd dental
$ git add report.txt
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   report.txt

$ git commit -m "Adding a reference."
[master 25c629b] Adding a reference.
 1 file changed, 3 insertions(+)

git commit -m "message" records a short explanation of the change (or whatever remark you like) along with the commit.

https://music.163.com/song?id=523251118&userid=3615376

Such a nice song.

How can I view a repository’s history? | Shell

git log prints:

commit 0430705487381195993bac9c21512ccfb511056d
Author: Rep Loop <repl@datacamp.com>
Date:   Wed Sep 20 13:42:26 2017 +0000

    Added year to report title.

Each commit clearly has a unique ID (the hash on the commit line). Press the space bar to go down a page and q to quit.

The most recent commit is listed first.
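The ordering is easy to check in a scratch repository. This is a sketch assuming Git is installed; demo_commit is a hypothetical helper that commits with a throwaway identity, and --format=%s (a standard git log option, not used in the course text) prints only each commit's subject line.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: commit with a throwaway identity so the example
# does not depend on any global Git configuration.
demo_commit() {
    git -c user.name='Demo User' -c user.email='demo@example.com' \
        commit -q -m "$1"
}

printf 'draft\n' > report.txt
git add report.txt
demo_commit "First commit"

printf 'revised\n' >> report.txt
git add report.txt
demo_commit "Second commit"

# One subject line per commit, in the same order as plain git log.
subjects=$(git log --format=%s)
printf '%s\n' "$subjects"
```

The newer "Second commit" is printed above "First commit".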

How can I view a specific file’s history? | Shell

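git log path shows the commits that touched that file or directory (ID, author, date and message for each), not the changes to the content itself.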

$ cd dental
$ git log data/southern.csv
commit 17d5edca267575840a61985d0e428612bdef7130
Author: Rep Loop <repl@datacamp.com>
Date:   Tue Nov 21 23:13:31 2017 +0000

    Adding fresh data for southern and western regions.

commit 0e7d353aa17d19fdbb562c552212754e56fbc66f
Author: Rep Loop <repl@datacamp.com>
Date:   Tue Nov 21 23:13:31 2017 +0000

    Added seasonal CSV data files
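That a path restricts the log to commits touching that path can be checked with two commits that change different files. This is again a sketch under the assumption that Git is installed; the file names mirror the course's dental project, but the contents, commit messages and the demo_commit helper are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir data

# Hypothetical helper: commit with a throwaway identity.
demo_commit() {
    git -c user.name='Demo User' -c user.email='demo@example.com' \
        commit -q -m "$1"
}

printf 'Date,Tooth\n' > data/southern.csv
git add data/southern.csv
demo_commit "Add southern data"

printf 'TODO: write executive summary.\n' > report.txt
git add report.txt
demo_commit "Add report"

all_subjects=$(git log --format=%s)                     # both commits
file_subjects=$(git log --format=%s data/southern.csv)  # only one commit

printf 'whole repository:\n%s\n' "$all_subjects"
printf 'data/southern.csv only:\n%s\n' "$file_subjects"
```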

How do I write a better log message? | Shell

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Your branch is up-to-date with 'origin/master'.
#
# Changes to be committed:
#       modified:   skynet.R
#

If you run git commit without -m, Git opens an editor showing a template like this one as a prompt.
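What "lines starting with '#' will be ignored" means for the saved message can be imitated in plain shell. This is only a rough sketch of the idea, not Git's actual cleanup code: strip_comments is a hypothetical helper, and the message text is invented.

```shell
# Hypothetical helper: drop every line that starts with '#',
# the way the template says Git will.
strip_comments() {
    grep -v '^#'
}

# A message typed above the template: only the first line survives.
message=$(printf '%s\n' \
    'Fix the year in the report title' \
    '' \
    '# Please enter the commit message for your changes. Lines starting' \
    '# On branch master' | strip_comments)

# A template left untouched: nothing survives, so the message is empty,
# which is the case where Git aborts the commit.
only_comments=$(printf '%s\n' '# On branch master' '#' | strip_comments)

echo "message: [$message]"
echo "untouched template: [$only_comments]"
```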


  1. repository, American pronunciation /rɪˈpɑzəˌtɔri/. I put the stress in the wrong place every time.