Algorithm To Predict Stock Direction

Last updated 4 years ago

Machine Learning Algorithm To Predict Stock Direction

In 2014, the commission-free trading app Robinhood opened for business. I eagerly signed up, put money in, and imprudently bought a few high-tech biotech stocks that caught my eye. One year later, my hastily scraped-together “portfolio” was down 40%. Fast forward 4 years, and I have now set out to apply quantitative techniques to predict stock price direction in order to turn a profit.

Goals

This post will teach the reader how to apply ML techniques to predict stock price direction. In addition, it will shed light on how to use the repository’s backtesting module with your own algorithms.

High School Math → Machine Learning

Almost everyone had some sort of class in high school where they took observations (maybe measuring the height of a plant over time in biology class).

Here is an example of plant height data.

Let’s plot the data:

Then in class, you would be asked to add a trend line (blue dotted line). The trend line, or line of best fit, is drawn through the points in a way that minimizes the vertical distance (or error) between the line and each point. Once you had your line, if someone asked you how tall the plant is on day 6.5, you would find 6.5 on the X-axis and read off the Y value of the trend line at that point (~6.1).
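
To make the trend line concrete, here is a minimal sketch using NumPy's least-squares polynomial fit. The measurements are made up for illustration (they are not the data from the table above), and the function names are my own.

```python
import numpy as np


def fit_trend_line(days, heights):
    """Return (slope, intercept) of the least-squares line through the points."""
    slope, intercept = np.polyfit(days, heights, deg=1)
    return float(slope), float(intercept)


def predict_height(day, slope, intercept):
    """Read the trend line at a given day."""
    return slope * day + intercept


# Hypothetical plant measurements: day number and height.
days = [1, 2, 3, 4, 5, 6, 7, 8]
heights = [1.2, 2.0, 3.1, 3.8, 4.9, 5.6, 6.7, 7.4]

slope, intercept = fit_trend_line(days, heights)
print(f"estimated height on day 6.5: {predict_height(6.5, slope, intercept):.2f}")
```

Reading the line at day 6.5 is the same thing you did by eye in class, just computed instead of drawn.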

A linear regression is fairly trivial and can even be computed by hand; however, what if we varied the amount of water the plant was given each day as well as varied the amount of sunlight each day? Now our input data would look like this:

Let’s plot the data (the axes are a little shifted):

Now if someone were to ask you what you would expect the height of the plant to be on day 3 with 0.7 water and 5 light, it becomes a little more challenging; however, that’s where we can start to use some machine learning techniques. Let’s add some new vocabulary to our old thinking of statistics. Our input data points are now called features, and what we’re predicting or measuring are our target values.
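
With several features, the same least-squares idea still applies; it just can't be drawn by hand anymore. The sketch below fits a linear model on [day, water, light] rows with NumPy. The rows and heights are invented for illustration, and the helper names are assumptions of this example.

```python
import numpy as np


def fit_linear_model(features, targets):
    """Least-squares fit of targets on features; the last weight is the intercept."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(targets, dtype=float)
    X_with_bias = np.column_stack([X, np.ones(len(X))])  # constant column for the intercept
    weights, *_ = np.linalg.lstsq(X_with_bias, y, rcond=None)
    return weights


def predict(weights, feature_row):
    """Apply the fitted weights to one new feature row."""
    return float(np.dot(weights[:-1], feature_row) + weights[-1])


# Hypothetical observations: [day, water, light] -> plant height.
features = [[1, 0.5, 4], [2, 0.8, 6], [3, 0.4, 5], [4, 0.9, 7], [5, 0.6, 3], [6, 0.7, 8]]
heights = [1.0, 2.3, 2.6, 4.2, 4.1, 5.8]

weights = fit_linear_model(features, heights)
print(f"expected height on day 3 with 0.7 water and 5 light: {predict(weights, [3, 0.7, 5]):.2f}")
```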

In our problem of predicting stock direction, we aren’t looking at something like height that can take on any number 0 to +infinity. We are looking at only two possible outcomes, either the stock goes up in a few days or goes down (binary classification). Let’s relabel our columns and change our Y_height column to only include two outcomes. Now our table looks more like it could be our stock data.

Now, our goal is to train a model where we could give it a new unseen feature set and have it predict the price direction for some future target date.

ex: [5, 0.6, 7] → 1, [4, 0.3, 8] → -1

Background — ML Techniques

Note: This module is written in Python and uses the Scikit-learn library. I highly recommend this package for anyone looking to get started with ML.

ML can be broken down into supervised and unsupervised learning. Unsupervised learning is when the feature set doesn’t come with target values, and the algorithm’s goal is to group the input data based on its features. Imagine taking out the Y column from above and telling the computer to sort the input features into a certain number of groups. Supervised learning is when a target value is provided for each feature set. This module uses supervised learning. More specifically, it solves a binary classification problem. We call it “binary” because the stock does only one of two things: it goes up or it goes down. The price can stay the same, but we’re counting this as a negative outcome in this case.

The module lets the user input their own custom feature sets and matches each one to a target value (+1 if the stock goes up, -1 if it goes down) a specified number of days into the future. For example, you can provide the module with a moving average for the past X days, and it will look ahead 2 days from now to determine how the stock price moved. The moving average data will be the feature set and the binary outcome (price direction up or down) will be the target value. Then, the module pipeline generates a model that can be used to predict the stock price direction on a new, unseen set of data.
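
As a rough illustration of that pairing, the sketch below builds a single moving-average feature and matches it to a look-ahead target. The function names, the 3-day window, and the price list are assumptions made for this example, not the module's actual API.

```python
import numpy as np


def moving_average(prices, window):
    """Trailing averages; entry i covers prices[i : i + window]."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(prices, dtype=float), kernel, mode="valid")


def label_direction(prices, day, lookahead):
    """+1 if the price is higher `lookahead` days later, else -1 (flat counts as -1)."""
    return 1 if prices[day + lookahead] > prices[day] else -1


def build_examples(prices, window=3, lookahead=2):
    """Pair each moving average with the direction of the price `lookahead` days later."""
    features, targets = [], []
    for i, average in enumerate(moving_average(prices, window)):
        today = i + window - 1  # last day covered by this average
        if today + lookahead >= len(prices):
            break  # no future price available to label this day
        features.append([float(average)])
        targets.append(label_direction(prices, today, lookahead))
    return features, targets


# Hypothetical closing prices.
features, targets = build_examples([1, 2, 3, 4, 5, 4, 3], window=3, lookahead=2)
print(features, targets)
```

Note that the last few days never get a target, because their future price is not known yet.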

Let’s Get Coding

First, we need to fetch the stock data. Since Yahoo Finance no longer works with the pandas_datareader library, I switched to the Morningstar API. pandas_datareader still offers a really nice way to fetch the data.

Above is the abridged code. In the module, I have some helper functions that clean up the response from the API. For example, there is a function that removes missing data. The module only takes into account daily stock closing prices; however, you can modify it to use different types of data.

Selecting A Feature Set

One of the most important parts of any machine learning algorithm is the selection and manipulation of data into a feature set you believe is correlated with what you are trying to predict. I recently had an interesting feature set I wanted to test, which motivated this entire project. It is a twist on a common indicator known as divergence from the mean. Stocks in an industry tend to move together. This feature set is supposed to pick up stocks that are moving against industry momentum and should soon correct themselves.

1. Convert the stock data to percent change per day. This is important because it puts all the numbers we’re working with on the same scale.

2. Line up the stocks and plot them as a surface so we can see the data (oooh pretty colors). The X-axis is the different stock tickers lined up (sorted by market cap). The Y-axis is the days, and the Z-axis is the price change per day.

3. For each day, sum the difference between the stock of interest and the rest of the stocks in the pool. Let’s call this new column the stock’s “Slope Sum” since it sums the slope for each of the days compared to each of the stocks in the pool.

4. Run a sliding window over the Slope Sums in order to batch them together into a “feature set” for each stock. I chose to look at an 18-day sliding window.

What we’re expecting to find is that stocks that have an abnormally high or low Slope Sum batch should have a price reversal.
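
Steps 1, 3, and 4 can be sketched with NumPy as follows. This is my reading of the description, not the repository's code: I assume the Slope Sum for a day is the sum of (stock of interest minus other stock) over every other stock in the pool, and all function names are hypothetical.

```python
import numpy as np


def percent_change(prices):
    """Daily percent change per stock; `prices` has shape (days, stocks)."""
    prices = np.asarray(prices, dtype=float)
    return (prices[1:] - prices[:-1]) / prices[:-1] * 100.0


def slope_sums(changes, stock_index):
    """Per day, sum the differences between one stock and the rest of the pool."""
    changes = np.asarray(changes, dtype=float)
    # The stock's difference with itself is zero, so it does not affect the sum.
    differences = changes[:, [stock_index]] - changes
    return differences.sum(axis=1)


def sliding_windows(values, window=18):
    """Batch consecutive values into overlapping windows, one row per window."""
    values = np.asarray(values, dtype=float)
    count = len(values) - window + 1
    if count <= 0:
        return np.empty((0, window))
    return np.stack([values[i:i + window] for i in range(count)])


# Hypothetical closing prices for a pool of 3 stocks over 6 days.
prices = [[100, 50, 20], [101, 51, 20], [103, 50, 21],
          [102, 52, 21], [104, 53, 22], [105, 52, 22]]
changes = percent_change(prices)
feature_sets = sliding_windows(slope_sums(changes, stock_index=0), window=3)
print(feature_sets.shape)
```

Each row of `feature_sets` is one batch of Slope Sums; with real data the window would be 18 days rather than 3.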

Target Values

Now we need to match up a target value for each 18-day batch of Slope Sums. The target value will be 1 if the stock price increased, or -1 if it decreased, a given number of days into the future. I chose to look 2 days into the future.

Visualize What We Just Did

I usually find it helpful to visualize algorithms. For example, let’s take 3 stocks in an Excel file.

This example takes a Slope Sum sliding window batch of 3 and checks a target value 2 days into the future. In this case, the target value would be -1 since the stock price dropped over the next two days (blue cell → green cell).

Plugging In The Machine Learning

Now that we have our feature sets and the target values associated with them, let’s train a supervised learning algorithm to predict price direction based on our feature sets. I chose to go with Scikit-learn’s Support Vector Machine (SVM), but also added support for other supervised learning algorithms in the full GitHub repository.

First, we split the incoming data into our testing and training data. Then, we train the model and save it for future backtesting.
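
A minimal sketch of that split, train, and save flow with Scikit-learn might look like the following. The data here is synthetic random noise with a made-up labeling rule, the function names are hypothetical, and keeping the split in time order (`shuffle=False`) is my own choice rather than something the module is known to do.

```python
import pickle

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def train_direction_model(features, targets, test_size=0.25):
    """Fit an SVM classifier and report its accuracy on the held-out rows."""
    # shuffle=False keeps the earliest rows for training and the latest for testing.
    X_train, X_test, y_train, y_test = train_test_split(
        features, targets, test_size=test_size, shuffle=False
    )
    model = SVC(kernel="rbf")
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


def save_model(model, path):
    """Persist the trained model so a backtest can load it later."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


# Synthetic stand-in for 18-day Slope Sum batches and their +1 / -1 targets.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 18))
targets = np.where(features.sum(axis=1) > 0, 1, -1)

model, accuracy = train_direction_model(features, targets)
print(f"held-out accuracy: {accuracy:.2f}")
```

Only load pickled models that you created yourself, since unpickling runs arbitrary code.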

Backtesting

Now that we have our model, we need to provide it with new feature sets and see what price direction it predicts. For example, for Facebook stock, we will send it the sliding window Slope Sum batches for whatever date range we are interested in. Then, the backtesting module creates a new array called a Bid Stream. The bid-stream-creator function takes a feature set for a given day and uses the model we trained to predict how the stock will move in the future. It returns 1 if it predicts the stock will go up and -1 if it predicts the stock will go down. This Bid Stream is then fed into the take_bid_stream_calculate_profit() function in order to determine our profit if we acted on the algorithm’s output (still assuming we used Robinhood… no commission fees!).
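
To show the idea, here is a simplified stand-in for that backtest. It is not the repository's take_bid_stream_calculate_profit(): the function names are hypothetical, and I assume a trading rule where we hold one share for a day whenever the signal is 1 and stay out of the market otherwise, with no fees.

```python
def create_bid_stream(model, feature_sets):
    """Turn model predictions into a list of +1 / -1 signals, one per day."""
    return [int(prediction) for prediction in model.predict(feature_sets)]


def profit_from_bid_stream(bid_stream, prices):
    """Profit per share when holding from day i to day i + 1 only if bid_stream[i] is 1."""
    profit = 0.0
    for signal, today, tomorrow in zip(bid_stream, prices[:-1], prices[1:]):
        if signal == 1:
            profit += tomorrow - today
    return profit


def buy_and_hold_profit(prices):
    """Baseline: buy one share on the first day and sell on the last."""
    return float(prices[-1] - prices[0])


# Hand-written signals and hypothetical prices, just to exercise the functions.
bid_stream = [1, 1, -1, -1, 1]
prices = [10, 11, 12, 9, 8, 10]
print(profit_from_bid_stream(bid_stream, prices), buy_and_hold_profit(prices))
```

Comparing against a buy-and-hold baseline is what makes a statement like "the algorithm beat the stock" meaningful.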

Let’s see if our model can make us any 💵 💵 💵 ! In the GitHub repository, you will find that I trained the model on the top 30 or so tech stocks by market cap. In the file tests/plotting.py you can find a function “test_plot_stock()” that takes a stock symbol and plots the stock’s close prices, our algorithm’s returns, and the bid stream.

It looks like the algorithm beat the stock by $6.50 over the ~2.5-year span. If you zoom in, it looks like the algorithm correctly predicted the drop in price and avoided it. Boom!

Let’s plot another stock:

Not so hot for Adobe. We lost about $1.25 and if you zoom in it appears that the algorithm incorrectly timed the price drop.

Overall, if we sum the returns for all of the ~30 top tech stocks, the algorithm comes out on top by $7.06, or about +0.4%, over a ~2.5-year period. The backtest for this calculation is in tests/test_backtest.py::test_on_array_of_tickers_profit().

Summary

I hope this post provided an informative overview of some ML techniques and how one could apply them to the stock market. I encourage readers to clone the repository and experiment with their own feature sets.

Moving forward, I am going to apply the same algorithm to a portfolio of stocks and other market sectors, as well as publish some more quantitative benchmarks on the algorithm’s returns.

Please leave a comment if you would like me to elaborate on any aspects of the post. It was getting a little lengthy, so I had to exclude some topics and pitfalls I ran into while building the module.

This presentation is not intended to be relied upon as advice to investors or potential investors and does not take into account the investment objectives, financial situation or needs of any investor. All investors should consider such factors in consultation with a professional advisor of their choosing when deciding if an investment is appropriate.

Reference :

https://medium.com/@jasonbamford/machine-learning-algorithm-to-predict-stock-direction-d54b7666cc7c
bballboy21/stock_surface (github.com): Machine Learning Algorithm To Predict Stock Direction