How to Set Up and Manage a Python Development Environment on macOS

Posted by: mlbee
Source: 机器学习小蜜蜂 (Machine Learning Little Bee)
Date: 2017-12-08 17:06

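-- Written before getting hands-on with machine learning

To make studying machine learning easier, this post describes how to set up and manage a Python development environment on a Mac: Anaconda for isolating Python 2 from Python 3 and for managing scientific computing packages, and Jupyter Notebook for Python development.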

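Resetting Python on macOS

macOS Sierra ships with Python 2, usually located at /usr/bin/python. It cannot be deleted even with root privileges. Since many system tools depend on Python 2, deleting it is not recommended anyway.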

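Python allows multiple versions to coexist, and there are many handy tools for managing Python packages, such as Homebrew (which installs into its Cellar), pip, and Anaconda.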

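To manage the Python environment cleanly, first delete every Python version that was installed through other channels (these usually live in /usr/local/bin).

Taking Python 2.7 as an example, the steps are:

1. Delete the Python 2.7 Framework

Identify the /usr/local/bin/python that is to be deleted: go into /usr/local/bin and check what it really points to.

It turns out that the real installation lives under ../Cellar/python. Delete it.

2. Remove the symbolic links

3. Remove PATH and other related environment variables from the relevant profile files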
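Before deleting anything, it helps to see what each Python-related entry in /usr/local/bin actually points to. The following is a minimal sketch that is not part of the original post: the helper name `inspect_python_links` is made up for illustration, and it only reports what it finds without deleting anything.

```python
import os


def inspect_python_links(bin_dir):
    """Return (name, target, exists) for every python* symlink in bin_dir.

    Hypothetical helper for illustration: it only reads the directory.
    """
    results = []
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if name.startswith("python") and os.path.islink(path):
            target = os.readlink(path)     # where the link points, as stored
            exists = os.path.exists(path)  # False when the link is dangling
            results.append((name, target, exists))
    return results


if __name__ == "__main__":
    bin_dir = "/usr/local/bin"
    if os.path.isdir(bin_dir):
        for name, target, exists in inspect_python_links(bin_dir):
            status = "ok" if exists else "dangling"
            print("%s -> %s (%s)" % (name, target, status))
```

Links reported as dangling after step 1 are the ones left to clean up in step 2.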

Installing Anaconda

What is Anaconda?

Anaconda is the leading open data science platform powered by Python. The open source version of Anaconda is a high performance distribution of Python and R and includes over 100 of the most popular Python, R and Scala packages for data science.

Additionally, you'll have access to over 720 packages that can easily be installed with conda, our renowned package, dependency and environment manager, that is included in Anaconda. See the packages included with Anaconda and the Anaconda changelog.

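In short, Anaconda is a Python distribution for scientific computing (though not limited to Python) that bundles more than 100 scientific computing packages together with their dependencies.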

Conda

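Anaconda bundles Conda, which handles both isolation between different Python versions (environment management) and package management.

Environment management

Every environment that gets installed is placed under ~/anaconda/envs.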
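Environments are normally created with `conda create --name <env> python=<version>` and listed with `conda env list`. As a rough cross-check of the directory layout described above, the sketch below lists the subdirectories of the envs folder. The helper name `list_conda_envs` is an illustrative assumption, and the path is the one mentioned in the text, which may differ on other installations.

```python
import os


def list_conda_envs(envs_dir):
    """Return the names of the environments stored under envs_dir."""
    if not os.path.isdir(envs_dir):
        return []
    return sorted(
        name for name in os.listdir(envs_dir)
        # skip plain files and hidden bookkeeping directories
        if not name.startswith(".")
        and os.path.isdir(os.path.join(envs_dir, name))
    )


if __name__ == "__main__":
    # ~/anaconda/envs is the location named in the text; adjust it if
    # Anaconda was installed somewhere else.
    print(list_conda_envs(os.path.expanduser("~/anaconda/envs")))
```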

Package management

Note that conda treats python and conda itself as packages, which makes them extremely easy to manage.
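Because python and conda are ordinary packages, they are installed and updated with the same subcommands as any library. The sketch below only builds and prints the command lines and does not execute them. The helper name `conda_command` and the environment name `py3` are assumptions made for illustration.

```python
def conda_command(action, package, env=None):
    """Build the argv list for `conda install/update/remove <package>`."""
    if action not in ("install", "update", "remove"):
        raise ValueError("unsupported action: %s" % action)
    argv = ["conda", action, "--yes"]
    if env is not None:
        argv += ["--name", env]  # target a named environment
    argv.append(package)
    return argv


if __name__ == "__main__":
    examples = (
        conda_command("update", "conda"),                   # conda updates itself
        conda_command("install", "python=3.5", env="py3"),  # python is a package too
        conda_command("install", "numpy", env="py3"),
    )
    for argv in examples:
        print(" ".join(argv))
```

Each list can be passed to `subprocess.call` once the printed commands look right.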

Adding a mirror

The result can be checked in Anaconda, which also offers a graphical way to configure it.
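Conda reads its channel list from the ~/.condarc file, and `conda config --add channels <url>` edits that same file. The original post does not show which mirror it used, so the sketch below renders a .condarc with a placeholder URL. Both the URL and the helper name `render_condarc` are assumptions for illustration.

```python
def render_condarc(mirror_urls):
    """Return .condarc text that lists mirror_urls ahead of the defaults."""
    lines = ["channels:"]
    for url in mirror_urls:
        lines.append("  - " + url)
    lines.append("  - defaults")
    lines.append("show_channel_urls: true")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder URL, not a real mirror: substitute the mirror you use.
    print(render_condarc(["https://mirror.example.com/anaconda/pkgs/free/"]))
```

Channels listed earlier take priority, which is why the mirror is placed before `defaults`.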

Setting Up Spark

Installing Spark is very simple: unpack the archive and just run it. By convention, create a symlink to pyspark in /usr/local/bin.

sudo ln -s /Applications/spark-2.1.0/bin/pyspark /usr/local/bin/pyspark

Two things to note:

1. Spark runs on Java 7+, Python 2.6+/3.4+. However, Python 3.6.0 is problematic (see "Unable to run pyspark" and "PySpark does not work with Python 3.6.0").

2. Edit /etc/hosts so that the local machine's hostname resolves to localhost.
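Both notes can be checked from Python before launching Spark. This is a small sketch that encodes only the constraints stated above (Python 2.6+/3.4+, with 3.6.0 excluded) for the Spark version used in this post. The function names are made up for illustration.

```python
import socket
import sys


def python_ok_for_spark(version):
    """Check a (major, minor, micro) tuple against the notes above."""
    major, minor, micro = version[:3]
    if (major, minor, micro) == (3, 6, 0):
        return False  # reported as not working with PySpark
    if major == 2:
        return minor >= 6
    if major == 3:
        return minor >= 4
    return False


def hostname_resolves():
    """Return True if this machine's hostname resolves to an IP address."""
    try:
        socket.gethostbyname(socket.gethostname())
        return True
    except socket.error:
        return False


if __name__ == "__main__":
    print("Python OK for Spark:", python_ok_for_spark(sys.version_info))
    print("Hostname resolves:", hostname_resolves())
```

If the second check prints False, the /etc/hosts entry from note 2 is the likely fix.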

IPython and Jupyter Notebooks

PYSPARK_PYTHON: the Python executable to use

PYSPARK_DRIVER_PYTHON: the Python executable to use for the driver

PYSPARK_DRIVER_PYTHON_OPTS: the options passed to the driver's Python executable (here, notebook)

Command:

PYSPARK_PYTHON=python PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark
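The same launch can be scripted. The sketch below sets the three variables on a copy of the current environment and hands it to pyspark. It assumes the pyspark symlink created earlier is on PATH, and the function names are illustrative only.

```python
import os
import subprocess


def pyspark_notebook_env(base_env=None):
    """Return a copy of the environment with the three PySpark variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = "python"
    env["PYSPARK_DRIVER_PYTHON"] = "ipython"
    env["PYSPARK_DRIVER_PYTHON_OPTS"] = "notebook"
    return env


def launch_pyspark_notebook():
    """Start pyspark with the notebook settings (blocks until it exits)."""
    return subprocess.call(["pyspark"], env=pyspark_notebook_env())
```

Calling `launch_pyspark_notebook()` is equivalent to the one-line command above. The caller's own environment is left unchanged because the variables are set on a copy.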

It runs perfectly in Jupyter Notebook.

Test code: counting word frequencies

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# `sc` is the SparkContext that the pyspark shell creates automatically.
textFile = sc.textFile("test.note")
textFile.count()

# Split every line into words, pair each word with 1, then sum the 1s per word.
wordCounts = textFile.flatMap(lambda line: line.split()).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b)

# Take 10 (word, count) pairs for plotting; these are not necessarily the most frequent words.
wordCountDict = dict(wordCounts.take(10))

bar_width = 0.35
opacity = 0.4
n_groups = len(wordCountDict)

fig, ax = plt.subplots()
index = np.arange(n_groups)
print(index + bar_width)

# The label gives plt.legend() an entry to show.
plt.bar(index, tuple(wordCountDict.values()), bar_width, alpha=opacity, color='b', label='Count')
plt.xlabel('Word')
plt.ylabel('Count')
plt.title('WordCount')
plt.xticks(index + bar_width, tuple(wordCountDict.keys()))
plt.ylim(0, 50)
plt.legend()
plt.tight_layout()
plt.show()
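To sanity-check the Spark result on a small file, the same counts can be computed without Spark. The sketch below is a plain-Python reference for what the flatMap / map / reduceByKey pipeline produces. The function name `word_counts` is an assumption for illustration.

```python
from collections import Counter


def word_counts(lines):
    """Count whitespace-separated words across an iterable of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())  # same split as the flatMap step
    return dict(counts)


if __name__ == "__main__":
    sample = ["spark makes word counts easy", "word counts in plain python"]
    print(word_counts(sample))
```

For a real file, pass the open file object, for example `word_counts(open("test.note"))`, and compare the result with `dict(wordCounts.collect())` from the notebook.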

References:

1. How to uninstall Python 2.7 on a Mac OS X 10.6.4?

2. Using Jupyter on Apache Spark: Step-by-Step with a Terabyte of Reddit Data

3. Running Spark Applications Using IPython and Jupyter Notebooks

(Note: this is an older post by the author, #/land-ml/python-env-best-prictice/)

