graph LR
Core_Utilities["Core Utilities"]
Data_Management_Preprocessing["Data Management & Preprocessing"]
Model_Training_Evaluation["Model Training & Evaluation"]
Natural_Language_Processing_NLP_Models_Components["Natural Language Processing (NLP) Models & Components"]
Computer_Vision_CV_Models_Components["Computer Vision (CV) Models & Components"]
Data_Management_Preprocessing -- "calls" --> Core_Utilities
Model_Training_Evaluation -- "calls" --> Core_Utilities
Data_Management_Preprocessing -- "provides input data to" --> Model_Training_Evaluation
Data_Management_Preprocessing -- "prepares data for" --> Natural_Language_Processing_NLP_Models_Components
Model_Training_Evaluation -- "trains" --> Natural_Language_Processing_NLP_Models_Components
Model_Training_Evaluation -- "trains" --> Computer_Vision_CV_Models_Components
Natural_Language_Processing_NLP_Models_Components -- "consumes data from" --> Data_Management_Preprocessing
Natural_Language_Processing_NLP_Models_Components -- "is trained by" --> Model_Training_Evaluation
Computer_Vision_CV_Models_Components -- "consumes data from" --> Data_Management_Preprocessing
Computer_Vision_CV_Models_Components -- "is trained by" --> Model_Training_Evaluation
click Core_Utilities href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//d2l-zh/Core_Utilities.md" "Details"
click Model_Training_Evaluation href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//d2l-zh/Model_Training_Evaluation.md" "Details"
click Natural_Language_Processing_NLP_Models_Components href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//d2l-zh/Natural_Language_Processing_NLP_Models_Components.md" "Details"
click Computer_Vision_CV_Models_Components href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//d2l-zh/Computer_Vision_CV_Models_Components.md" "Details"
The d2l-zh project is a deep learning educational resource that supports multiple frameworks (MXNet, PaddlePaddle, TensorFlow, PyTorch). Its architecture is modular, with a clear separation of concerns across the stages of a deep learning pipeline and across the supported frameworks.
The Core Utilities component provides cross-cutting helper functions used throughout the project. These include utilities for visualization (e.g., set_figsize, plot), timing operations (Timer), and general data handling (e.g., download_extract, download_all). They are used to set up experiments, display results, and manage datasets, regardless of the specific deep learning task or framework.
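As an illustration of the timing utility mentioned above, here is a minimal sketch of a stopwatch-style helper. The class name `SketchTimer` and its methods are hypothetical and written for this example; this is not the project's `Timer` code, only one plausible shape for such a helper.

```python
import time


class SketchTimer:
    """Hypothetical stopwatch that records several running times."""

    def __init__(self):
        self.times = []
        self.start()

    def start(self):
        # Remember the moment the current measurement began.
        self._tik = time.perf_counter()

    def stop(self):
        # Record the elapsed time since start() and return it.
        self.times.append(time.perf_counter() - self._tik)
        return self.times[-1]

    def avg(self):
        # Average of all recorded measurements.
        return sum(self.times) / len(self.times)

    def total(self):
        # Sum of all recorded measurements.
        return sum(self.times)


# Usage: time a small piece of work.
timer = SketchTimer()
squares = [i * i for i in range(10_000)]
elapsed = timer.stop()
```

A helper like this lets the same measurement code wrap any training or data-loading step, whichever framework runs inside it.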
Related Classes/Methods:
The Data Management & Preprocessing component is responsible for the entire data pipeline, from loading raw datasets to transforming them into a format suitable for deep learning models. It covers dataset loading for several domains (e.g., Fashion MNIST, Time Machine, IMDB, VOC), text tokenization, vocabulary creation, and image data handling for computer vision tasks. It acts as the primary data provider for all downstream model components.
Related Classes/Methods:
- d2l.mxnet (1:1)
- d2l.paddle (1:1)
- d2l.tensorflow (1:1)
- d2l.torch (1:1)
- d2lzh.utils (1:1)
- d2lzh.text.vocab.Vocabulary (1:1)
- d2lzh.utils.get_tokenized_imdb (187:191)
- d2lzh.utils.preprocess_imdb (341:352)
- d2l.mxnet.VOCSegDataset (1819:1849)
- d2l.paddle.VOCSegDataset (1931:1963)
- d2l.tensorflow.VOCSegDataset (1:1)
- d2l.torch.VOCSegDataset (1924:1954)
- d2lzh.utils.VOCSegDataset (809:836)
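The text-preprocessing steps named in the data component (tokenization, then vocabulary creation) can be sketched as follows. The names `simple_tokenize` and `SimpleVocab` are hypothetical and exist only for this example; the project's own `tokenize` and `Vocab` implementations may differ in options and behavior.

```python
from collections import Counter


def simple_tokenize(lines):
    """Split each line into lowercase, whitespace-separated word tokens."""
    return [line.lower().split() for line in lines]


class SimpleVocab:
    """Hypothetical vocabulary mapping tokens to integer indices."""

    def __init__(self, tokenized_lines, min_freq=1, unk="<unk>"):
        counter = Counter(tok for line in tokenized_lines for tok in line)
        # Sort by descending frequency, then alphabetically for determinism.
        ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        # Index 0 is reserved for unknown tokens.
        self.idx_to_token = [unk] + [t for t, c in ordered if c >= min_freq]
        self.token_to_idx = {t: i for i, t in enumerate(self.idx_to_token)}

    def __len__(self):
        return len(self.idx_to_token)

    def __getitem__(self, token):
        # Tokens not in the vocabulary map to index 0.
        return self.token_to_idx.get(token, 0)

    def encode(self, tokens):
        return [self[t] for t in tokens]


# Usage: build a vocabulary from two short lines and encode text with it.
lines = ["the time machine", "the time traveller"]
tokens = simple_tokenize(lines)
vocab = SimpleVocab(tokens)
encoded = vocab.encode(tokens[0])
unseen = vocab.encode(["the", "future"])
```

Reserving an index for unknown tokens is a common design choice: it lets the model handle words at prediction time that never appeared in the training corpus.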
The Model Training & Evaluation component contains the core logic for training and evaluating deep learning models across different tasks. It handles the training loop: iterating over epochs, processing mini-batches, computing the loss, performing backpropagation, and updating model parameters. It also calculates and reports performance metrics (e.g., accuracy) on validation or test datasets.
Related Classes/Methods:
- d2l.mxnet (1:1)
- d2l.mxnet.evaluate_accuracy (218:225)
- d2l.paddle (1:1)
- d2l.paddle.evaluate_accuracy (244:254)
- d2l.tensorflow (1:1)
- d2l.tensorflow.evaluate_accuracy (213:220)
- d2l.torch (1:1)
- d2l.torch.evaluate_accuracy (233:243)
- d2lzh.utils.train (532:556)
- d2lzh.utils.train_and_predict_rnn (571:613)
- d2lzh.utils.evaluate_accuracy (149:161)
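The training loop described for this component (epochs, mini-batches, loss, gradients, parameter updates) can be sketched in plain Python without any framework. This is an illustrative example on a one-dimensional linear model with hand-derived gradients, not the project's `train` function. All names here (`batch_iter`, `sgd_train`, `mean_squared_error`) are hypothetical, and mean squared error stands in for the accuracy metric because the toy task is regression.

```python
import random


def batch_iter(features, labels, batch_size, rng):
    """Yield shuffled mini-batches of (features, labels)."""
    indices = list(range(len(features)))
    rng.shuffle(indices)
    for i in range(0, len(indices), batch_size):
        batch = indices[i:i + batch_size]
        yield [features[j] for j in batch], [labels[j] for j in batch]


def sgd_train(features, labels, num_epochs=200, batch_size=4, lr=0.1, seed=0):
    """Fit y = w * x + b with mini-batch SGD on the squared loss."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(num_epochs):
        for X, y in batch_iter(features, labels, batch_size, rng):
            # Forward pass: prediction error for each example in the batch.
            errors = [w * x + b - t for x, t in zip(X, y)]
            # Gradients of the batch mean of 0.5 * error**2.
            grad_w = sum(e * x for e, x in zip(errors, X)) / len(X)
            grad_b = sum(errors) / len(X)
            # Parameter update.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


def mean_squared_error(w, b, features, labels):
    """Evaluation metric: average squared prediction error."""
    return sum((w * x + b - t) ** 2 for x, t in zip(features, labels)) / len(features)


# Usage: recover w = 2, b = 1 from noise-free synthetic data.
features = [x / 10 for x in range(-10, 11)]
labels = [2 * x + 1 for x in features]
w, b = sgd_train(features, labels)
mse = mean_squared_error(w, b, features, labels)
```

In a real framework the gradient lines are replaced by automatic differentiation and an optimizer step, but the loop structure stays the same.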
The Natural Language Processing (NLP) Models & Components component covers text processing and modeling within the project. It builds vocabularies and creates or loads token embeddings (e.g., GloVe, Word2Vec) to convert text into numerical representations. It also implements sequence models such as Recurrent Neural Networks (RNNs), the Transformer architecture, and BERT, along with the attention mechanisms they build on. Together these provide the tools and models required for language tasks such as language modeling, machine translation, and text classification.
Related Classes/Methods:
- d2l.mxnet.Vocab (512:554)
- d2l.mxnet.tokenize (501:510)
- d2l.mxnet.TokenEmbedding (2058:2091)
- d2l.paddle.Vocab (573:615)
- d2l.paddle.tokenize (562:571)
- d2l.paddle.TokenEmbedding (2188:2221)
- d2l.tensorflow.Vocab (532:574)
- d2l.tensorflow.tokenize (521:530)
- d2l.tensorflow.TokenEmbedding (1:1)
- d2l.torch.Vocab (560:602)
- d2l.torch.tokenize (549:558)
- d2l.torch.TokenEmbedding (2179:2212)
- d2lzh.text.vocab.Vocabulary (1:1)
- d2lzh.text.embedding.TokenEmbedding (56:94)
- d2l.mxnet.predict_ch8 (661:675)
- d2l.mxnet.train_ch8 (714:740)
- d2l.paddle.predict_ch8 (722:735)
- d2l.paddle.train_ch8 (788:811)
- d2l.tensorflow.predict_ch8 (683:697)
- d2l.tensorflow.train_ch8 (742:764)
- d2l.torch.predict_ch8 (709:723)
- d2l.torch.train_ch8 (773:796)
- d2lzh.utils.predict_rnn (304:316)
- d2lzh.utils.train_and_predict_rnn (571:613)
- d2l.mxnet.AdditiveAttention (1042:1068)
- d2l.mxnet.DotProductAttention (1070:1087)
- d2l.mxnet.MultiHeadAttention (1100:1137)
- d2l.paddle.AdditiveAttention (1150:1174)
- d2l.paddle.DotProductAttention (1176:1193)
- d2l.paddle.MultiHeadAttention (1206:1241)
- d2l.tensorflow.AdditiveAttention (1079:1105)
- d2l.tensorflow.DotProductAttention (1107:1124)
- d2l.tensorflow.MultiHeadAttention (1137:1174)
- d2l.torch.AdditiveAttention (1144:1168)
- d2l.torch.DotProductAttention (1170:1187)
- d2l.torch.MultiHeadAttention (1200:1238)
- d2l.mxnet.EncoderBlock (1208:1223)
- d2l.mxnet.TransformerEncoder (1225:1251)
- d2l.paddle.EncoderBlock (1314:1332)
- d2l.paddle.TransformerEncoder (1334:1362)
- d2l.tensorflow.EncoderBlock (1246:1261)
- d2l.tensorflow.TransformerEncoder (1263:1290)
- d2l.torch.EncoderBlock (1311:1329)
- d2l.torch.TransformerEncoder (1331:1359)
- d2l.mxnet.BERTModel (2167:2189)
- d2l.mxnet.MaskLM (2130:2153)
- d2l.mxnet.NextSentencePred (2155:2165)
- d2l.paddle.BERTModel (2300:2328)
- d2l.paddle.MaskLM (2264:2286)
- d2l.paddle.NextSentencePred (2288:2298)
- d2l.tensorflow.BERTModel (1:1)
- d2l.tensorflow.MaskLM (1:1)
- d2l.tensorflow.NextSentencePred (1:1)
- d2l.torch.BERTModel (2290:2318)
- d2l.torch.MaskLM (2254:2276)
- d2l.torch.NextSentencePred (2278:2288)
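To make the attention mechanism behind the Transformer and BERT models concrete, the following is a framework-free sketch of scaled dot-product attention on nested Python lists. It assumes the standard formulation, softmax(QKᵀ/√d)V, and omits masking and dropout. The function names are hypothetical, and this is not the project's `DotProductAttention` class.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of query, key and value vectors."""
    d = len(keys[0])  # key/query dimension used for scaling
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Usage: one query that matches the first key more closely than the second.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0], [20.0]]
outputs, weights = scaled_dot_product_attention(queries, keys, values)
```

Because the query aligns with the first key, the first value receives the larger weight and the output lies closer to 10 than to 20.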
The Computer Vision (CV) Models & Components component focuses on image processing and computer vision tasks. It includes implementations of convolutional neural network (CNN) architectures such as ResNet-18 for image classification. It also provides utilities for object detection (e.g., displaying bounding boxes, intersection over union (IoU) calculation, non-maximum suppression (NMS)) and image segmentation (e.g., VOC dataset handling).
Related Classes/Methods:
- d2l.mxnet.resnet18 (1371:1394)
- d2l.paddle.resnet18 (1486:1514)
- d2l.tensorflow.resnet18 (1:1)
- d2l.torch.resnet18 (1483:1511)
- d2lzh.utils.resnet18 (405:425)
- d2l.mxnet.show_bboxes (1535:1556)
- d2l.mxnet.box_iou (1558:1580)
- d2l.mxnet.assign_anchor_to_bbox (1582:1605)
- d2l.mxnet.multibox_target (1618:1649)
- d2l.mxnet.nms (1662:1676)
- d2l.mxnet.multibox_detection (1678:1710)
- d2l.paddle.show_bboxes (1641:1662)
- d2l.paddle.box_iou (1664:1685)
- d2l.paddle.assign_anchor_to_bbox (1687:1711)
- d2l.paddle.multibox_target (1724:1755)
- d2l.paddle.nms (1768:1782)
- d2l.paddle.multibox_detection (1784:1817)
- d2l.tensorflow.show_bboxes (1:1)
- d2l.tensorflow.box_iou (1:1)
- d2l.tensorflow.assign_anchor_to_bbox (1:1)
- d2l.tensorflow.multibox_target (1:1)
- d2l.tensorflow.nms (1:1)
- d2l.tensorflow.multibox_detection (1:1)
- d2l.torch.show_bboxes (1637:1658)
- d2l.torch.box_iou (1660:1681)
- d2l.torch.assign_anchor_to_bbox (1683:1707)
- d2l.torch.multibox_target (1720:1752)
- d2l.torch.nms (1765:1779)
- d2l.torch.multibox_detection (1781:1813)
- d2lzh.utils.show_bboxes (471:483)
- d2l.mxnet.VOCSegDataset (1819:1849)
- d2l.paddle.VOCSegDataset (1931:1963)
- d2l.tensorflow.VOCSegDataset (1:1)
- d2l.torch.VOCSegDataset (1924:1954)
- d2lzh.utils.VOCSegDataset (809:836)
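The two object-detection utilities named above, IoU and NMS, can be sketched in plain Python as follows. The sketch assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates and uses a simple greedy NMS. The names `iou` and `greedy_nms` are hypothetical, and the project's `box_iou` and `nms` functions operate on framework tensors and may differ in detail.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Usage: the second box overlaps the first heavily and is suppressed.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, iou_threshold=0.5)
```

The threshold controls how aggressive the suppression is: a lower value removes more overlapping predictions, a higher value keeps more of them.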