Python Data Science Tutorial

“Data science” is about as broad a term as they come. It may be easiest to describe it by listing its more concrete components:

Data exploration & analysis.

  • Included here: Pandas; NumPy; SciPy; a helping hand from Python’s Standard Library.
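To make the "helping hand from Python's Standard Library" concrete, here is a minimal sketch of a first-pass exploration that uses only the `statistics` and `collections` modules. The `records` list and its field names are made up purely for illustration; real work would more likely load a file into Pandas.

```python
import statistics
from collections import Counter

# Hypothetical sample records, invented for this illustration only.
records = [
    {"city": "Austin", "temp_c": 31.0},
    {"city": "Boston", "temp_c": 24.5},
    {"city": "Austin", "temp_c": 33.5},
    {"city": "Denver", "temp_c": 27.0},
]

# Pull one numeric column out of the records.
temps = [row["temp_c"] for row in records]

# Basic descriptive statistics for that column.
summary = {
    "count": len(temps),
    "mean": statistics.mean(temps),
    "median": statistics.median(temps),
    "min": min(temps),
    "max": max(temps),
}

# Frequency of each category in a non-numeric column.
city_counts = Counter(row["city"] for row in records)

print(summary)
print(city_counts.most_common(1))
```

This is roughly what `DataFrame.describe()` and `Series.value_counts()` would give you in Pandas, just spelled out by hand.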

Data visualization. The name is pretty self-explanatory: taking data and turning it into something colorful.

  • Included here: Matplotlib; Seaborn; Datashader; others.

Classical machine learning. Conceptually, we could define this as any supervised or unsupervised learning task that is not deep learning. Scikit-learn is far and away the go-to tool for implementing classification, regression, clustering, and dimensionality reduction, while StatsModels is less actively developed but still has a number of useful features.

  • Included here: Scikit-learn; StatsModels.
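As a rough sketch of what a supervised classification task looks like in scikit-learn, the example below fits a logistic regression on the Iris dataset that ships with the library. The choice of model, the 75/25 split, and `random_state=0` are assumptions made for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the bundled Iris dataset as a feature matrix X and label vector y.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for evaluation; stratify keeps class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple linear classifier; max_iter is raised so the solver converges.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score the model on the held-out rows.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

The same fit/predict pattern applies to scikit-learn's regression, clustering, and dimensionality-reduction estimators, which is a large part of why the library is so widely used.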

Data storage and big data frameworks. Big data is best defined as data that is either literally too large to reside on a single machine or can’t be processed without a distributed environment. The Python bindings to Apache technologies play a big role here.

  • Included here: Apache Spark; Apache Hadoop; HDFS; Dask; h5py/PyTables.

Odds and ends. Includes subtopics such as natural language processing, and image manipulation with libraries such as OpenCV.

  • Included here: NLTK; spaCy; OpenCV/cv2; scikit-image; Cython.

