2. Popular tools used in data science
ā¢ Data pre-processing and analysis
Python, R, Microsoft Excel, SAS, SPSS
ā¢ Data exploration and visualization
Tbleau, Qlikview, MicrosoftExcel
ā¢ Parallel and distributed computing in case of bigdata
Apache Spark, Apache Hadoop
3. Evolution of Python
ā¢ Python was developed by Guidovan Rossum in the late
eighties at the
ā¢ āNational Research Institute for Mathematics and
Computer Scienceā at Netherlands
ā¢ Python Editions
ā¢ Python 1.0
ā¢ Python 2.0
ā¢ Python 3.0 and many more
4. Python as a programming language
ā¢ Standard C Python interpreter is managed by āPython
Software Foundationā
ā¢ There are other interpreters namely JPython(Java), Iron
Python (C#), Stackless Python (C, used for parallelism),
PyPy (Python itself JIT compilation)
ā¢ Standard libraries are written in python itself
ā¢ High standards of readability
5. Advantages of using python
ā¢ Python has several features that make it wel lsuited for
data science
ā¢ Open source and community development
ā¢ Developed under Open Source Initiative license making it
free to use and distribute even commercially
ā¢ Syntax used is simple to understand and code
ā¢ Libraries designed for specific data science tasks
ā¢ Combines well with majority of the cloud platform service
providers
6. Coding environment
ā¢ A software program can be written using a terminal, a
command prompt (cmd), a text editor or through an
Integrated Development Environment (IDE)
ā¢ The program needs to be saved in a file with an
appropriate extension (.py for python, .mat for matlab,
etc...) and can be executed in corresponding environment
(Python, Matlab, etcā¦)
ā¢ Integrated Development Environment (IDE) is a software
product solely developed to support software
development in various or specific programming
language(s)
7. Coding environment
ā¢ Python 2.x support will be available till 2020
ā¢ Python 3.x is an enhanced version of 2.x and will only be
maintained from 3.6.x post 2020
ā¢ Install basic python version or use the online python
console as in https://www.python.org/
ā¢ Execute following commands and view the outputs in
terminal or command prompt
ā¢ ā¢Basic print statement
ā¢ ā¢Naming conventions for variables and functions,
operators
ā¢ ā¢Conditional operations, looping statements (nested)
ā¢ ā¢Function declaration and calling
ā¢ ā¢Installing modules
8. Integrated development environment
(IDE)
ā¢ Software application consisting of a cohesive unit of tools
required for development
ā¢ Designed to simplify software development
ā¢ Utilities provided by IDEs include tools for managing,
compiling, deploying and debugging software
9. Coding environment-IDE
ā¢ An IDE usually comprises of
ā¢ ā¦Source code editor
ā¢ ā¦Compiler
ā¢ ā¦Debugger
ā¢ ā¦Additional features include syntax and error highlighting,
code completion
ā¢ Offers supports in building and executing the program
along with debugging the code from within the
environment
10. Coding environment-IDE
ā¢ Best IDEs provide version control features
ā¢ Eclipse + Py Dev, Sublime Text, Atom, GNU Emacs,
Vi/Vim, Visual Studio, Visual Studio Code are general
IDEs with python support
ā¢ Apart from these some of the python specific editors
include
ā¢ Pycharm,
ā¢ Jupyter,
ā¢ Spyder,
ā¢ Thonny
11. PyCharm
ā¢ Supported across Linux, MacOSX and Windowsplatforms
ā¢ Available as community (free open source) and
professional (paid) version
ā¢ Supports only Python
ā¢ Can be installed separately or through Anaconda
distribution
ā¢ Features include
ā¢ ā¦Code editor provides syntax and error highlighting
ā¢ ā¦Code completion and navigation
ā¢ ā¦Unit testing
ā¢ ā¦Debugger
ā¢ ā¦Versioncontrol
13. Jupyter Notebook
ā¢ Web application that allows creation and manipulation of
documents called ānotebookā
ā¢ Supported across Linux, MacOSX and Windowsplatforms
ā¢ Available as open source version
15. Jupyter Notebook
ā¢ Bundled with Anaconda distribution or can be installed
separately
ā¢ Supports Julia, Python, R and Scala
ā¢ Consists of ordered collection of input and output cells
that contain code, text, plots etc.
ā¢ Allows sharing of code and narrative text through output
formats like PDF, HTML etc.
ā¢ ā¦Education and presentation tool
ā¢ Lack most of the features of a good IDE
16. Spyder
ā¢ Supported across Linux, Mac-OSX and Windows
platforms
ā¢ Available as open source version
ā¢ Can be installed separately or through Anaconda
distribution
ā¢ Developed for Python and specifically data science
ā¢ Features include
ā¢ ā¦Code editor with robust syntax and error highlighting
ā¢ ā¦Code completion and navigation
ā¢ ā¦Debugger
ā¢ ā¦Integrated document
ā¢ Interface similar to MATLAB and RStudio