Gen TANG 1 year ago
parent
commit
4cce04b2d5
4 changed files with 72 additions and 0 deletions
  1. 18 0
      video/deep_rnn.ipynb
  2. 18 0
      video/lstm.ipynb
  3. 18 0
      video/mlp_nlp.ipynb
  4. 18 0
      video/rnn_nlp.ipynb
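
The markdown cell this commit adds to each notebook replaces the hub download with locally extracted files and renames two fields (`repository_name` to `repo`, `whole_func_string` to `original_string`). Before pointing `load_dataset` at the local files, it can help to confirm that they really carry those field names. The following is a minimal sketch for that check, assuming the archive was extracted to the path shown in the cell. It uses only the Python standard library rather than the `datasets` library, and `iter_records` and `filter_by_repo` are hypothetical helper names, not part of the notebooks.

```python
import glob
import gzip
import json


def iter_records(pattern):
    """Yield one dict per non-empty JSON line from every .jsonl.gz file matching pattern."""
    for path in sorted(glob.glob(pattern)):
        with gzip.open(path, "rt", encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:
                    yield json.loads(line)


def filter_by_repo(records, repo_substring="apache/spark"):
    """Keep records whose 'repo' field contains repo_substring."""
    return [r for r in records if repo_substring in r.get("repo", "")]


if __name__ == "__main__":
    # Path copied from the notebook cell; replace it with your own directory.
    pattern = "data/python/python/final/jsonl/train/*.jsonl.gz"
    rows = filter_by_repo(iter_records(pattern))
    print(len(rows), "matching records")
    if rows:
        # Listing the keys shows whether 'repo' and 'original_string' are present.
        print(sorted(rows[0].keys()))
```

If the printed keys include `repo` and `original_string`, the renamed fields in the notebook cell should match the local files.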

+ 18 - 0
video/deep_rnn.ipynb

@@ -132,6 +132,24 @@
     "device = 'cuda' if torch.cuda.is_available() else 'cpu'"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Thanks to Wanghaha (xufengnian-bei) for this contribution. If you run into network problems while downloading the dataset, follow the steps below.\n",
+    "\n",
+    "* Visit the Hugging Face dataset page: https://huggingface.co/datasets/code_search_net\n",
+    "* On that page, find the \"Files and versions\" section.\n",
+    "* Open the data folder and download the corresponding python.zip\n",
+    "\n",
+    "Then change the dataset-loading code accordingly:\n",
+    "\n",
+    "datasets = load_dataset('json', data_files='data/python/python/final/jsonl/train/*.jsonl.gz') # replace with your own directory\n",
+    "datasets = datasets['train'].filter(lambda x: 'apache/spark' in x['repo']) # repository_name is replaced by repo here\n",
+    "\n",
+    "print(datasets[8]['original_string']) # whole_func_string is replaced by original_string"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 4,

+ 18 - 0
video/lstm.ipynb

@@ -134,6 +134,24 @@
     "device = 'cuda' if torch.cuda.is_available() else 'cpu'"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Thanks to Wanghaha (xufengnian-bei) for this contribution. If you run into network problems while downloading the dataset, follow the steps below.\n",
+    "\n",
+    "* Visit the Hugging Face dataset page: https://huggingface.co/datasets/code_search_net\n",
+    "* On that page, find the \"Files and versions\" section.\n",
+    "* Open the data folder and download the corresponding python.zip\n",
+    "\n",
+    "Then change the dataset-loading code accordingly:\n",
+    "\n",
+    "datasets = load_dataset('json', data_files='data/python/python/final/jsonl/train/*.jsonl.gz') # replace with your own directory\n",
+    "datasets = datasets['train'].filter(lambda x: 'apache/spark' in x['repo']) # repository_name is replaced by repo here\n",
+    "\n",
+    "print(datasets[8]['original_string']) # whole_func_string is replaced by original_string"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 4,

+ 18 - 0
video/mlp_nlp.ipynb

@@ -416,6 +416,24 @@
     "eval_iters = 10"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Thanks to Wanghaha (xufengnian-bei) for this contribution. If you run into network problems while downloading the dataset, follow the steps below.\n",
+    "\n",
+    "* Visit the Hugging Face dataset page: https://huggingface.co/datasets/code_search_net\n",
+    "* On that page, find the \"Files and versions\" section.\n",
+    "* Open the data folder and download the corresponding python.zip\n",
+    "\n",
+    "Then change the dataset-loading code accordingly:\n",
+    "\n",
+    "datasets = load_dataset('json', data_files='data/python/python/final/jsonl/train/*.jsonl.gz') # replace with your own directory\n",
+    "datasets = datasets['train'].filter(lambda x: 'apache/spark' in x['repo']) # repository_name is replaced by repo here\n",
+    "\n",
+    "print(datasets[8]['original_string']) # whole_func_string is replaced by original_string"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 15,

+ 18 - 0
video/rnn_nlp.ipynb

@@ -128,6 +128,24 @@
     "device = 'cuda' if torch.cuda.is_available() else 'cpu'"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Thanks to Wanghaha (xufengnian-bei) for this contribution. If you run into network problems while downloading the dataset, follow the steps below.\n",
+    "\n",
+    "* Visit the Hugging Face dataset page: https://huggingface.co/datasets/code_search_net\n",
+    "* On that page, find the \"Files and versions\" section.\n",
+    "* Open the data folder and download the corresponding python.zip\n",
+    "\n",
+    "Then change the dataset-loading code accordingly:\n",
+    "\n",
+    "datasets = load_dataset('json', data_files='data/python/python/final/jsonl/train/*.jsonl.gz') # replace with your own directory\n",
+    "datasets = datasets['train'].filter(lambda x: 'apache/spark' in x['repo']) # repository_name is replaced by repo here\n",
+    "\n",
+    "print(datasets[8]['original_string']) # whole_func_string is replaced by original_string"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 4,