Conda for data scientists
Conda is useful for any packaging process, but its utility for data science sets it apart from other package and environment management systems.
Conda’s benefits include:
Providing prebuilt packages, so you do not need to deal with compilers or figure out how to set up a specific tool.
Installing tools that are otherwise challenging to install (such as TensorFlow or IRAF) in a single step.
Letting you share your environment with other people across different platforms, which supports the reproducibility of research workflows.
Allowing the use of other package management tools, such as pip, inside conda environments when a library or tool is not yet packaged for conda.
Providing commonly used data science libraries and tools, such as R, NumPy, SciPy, and TensorFlow. These are built using optimized, hardware-specific libraries (such as Intel’s MKL or NVIDIA’s CUDA), which improve performance without code changes.
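The environment-sharing and pip points above can be made concrete with a small sketch. The helper below is hypothetical (it is not part of conda); it only renders the text of an environment file, with conda packages listed under `dependencies` and pip-only packages in a nested `pip` section. The environment name, the Python version pin, and the package name `example-pip-only-package` are illustrative assumptions.

```python
def render_environment_yml(name, channels, conda_deps, pip_deps=()):
    """Return the text of a conda environment file (hypothetical helper)."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dep}" for dep in conda_deps]
    if pip_deps:
        # List pip itself as a conda dependency before the nested pip section.
        if "pip" not in conda_deps:
            lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"      - {dep}" for dep in pip_deps]
    return "\n".join(lines) + "\n"


# Illustrative values only; substitute your own packages.
spec = render_environment_yml(
    name="ds-example",
    channels=["conda-forge"],
    conda_deps=["python=3.11", "numpy", "scipy"],
    pip_deps=["example-pip-only-package"],
)
print(spec)
```

Saved as `environment.yml`, text of this shape can be passed to `conda env create -f environment.yml` on another machine to recreate the environment there.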