In today’s financial markets, understanding market breadth is essential to gauge the overall market sentiment. Market breadth indicators can help investors and traders make informed decisions by analyzing the underlying strength or weakness of a market. One such indicator is the Net Advance Volume, which takes into account the number of advancing and declining stocks along with their trading volumes.
This code downloads stock data, cleans it, calculates value averages, percent changes, and signs, then groups the data by date and sign, and finally formats the data in a more readable way.
Importing Libraries:
Begin by importing the necessary libraries:
import pandas as pd
import numpy as np
import yfinance as yf
Loading the Data:
symbol_list_set50
['ADVANC.BK',
'AOT.BK',
'AWC.BK',
'BANPU.BK',
'BBL.BK',
'BDMS.BK',
'BEM.BK',
'BGRIM.BK',
'BH.BK',
'BTS.BK',
'CBG.BK',
'CENTEL.BK',
'COM7.BK',
'CPALL.BK',
'CPF.BK',
'CPN.BK',
'CRC.BK',
'DELTA.BK',
'EA.BK',
'EGCO.BK',
'GLOBAL.BK',
'GPSC.BK',
'GULF.BK',
'HMPRO.BK',
'INTUCH.BK',
'IVL.BK',
'JMART.BK',
'JMT.BK',
'KBANK.BK',
'KTB.BK',
'KTC.BK',
'LH.BK',
'MINT.BK',
'MTC.BK',
'OR.BK',
'OSP.BK',
'PTT.BK',
'PTTEP.BK',
'PTTGC.BK',
'RATCH.BK',
'SAWAD.BK',
'SCB.BK',
'SCC.BK',
'SCGP.BK',
'TIDLOR.BK',
'TISCO.BK',
'TOP.BK',
'TRUE.BK',
'TTB.BK',
'TU.BK']
Download data from the symbol ticker list:
data = yf.download(symbol_list_set50, start='2020-09-01',end='2022-09-01', interval='1d')
[*********************100%***********************] 50 of 50 completed
data
Cleaning the Data:
data.fillna(method='ffill',inplace=True)
df=data.stack()
stock_df =df.sort_index(level=1)
stock_df.index.names=['Date','Stock']
stock_df
Adj Close | Close | High | Low | Open | Volume | ||
---|---|---|---|---|---|---|---|
Date | Stock | ||||||
2020-09-01 | ADVANC.BK | 165.805130 | 182.500000 | 184.000000 | 182.500000 | 183.500000 | 3103300.0 |
2020-09-02 | ADVANC.BK | 166.713669 | 183.500000 | 184.000000 | 182.000000 | 182.500000 | 2853100.0 |
2020-09-03 | ADVANC.BK | 166.713669 | 183.500000 | 184.000000 | 182.500000 | 183.500000 | 2651600.0 |
2020-09-08 | ADVANC.BK | 164.896637 | 181.500000 | 183.500000 | 181.500000 | 182.500000 | 4256700.0 |
2020-09-09 | ADVANC.BK | 163.988083 | 180.500000 | 181.500000 | 180.000000 | 180.000000 | 2468800.0 |
… | … | … | … | … | … | … | … |
2022-08-25 | TU.BK | 16.918489 | 17.400000 | 17.700001 | 17.299999 | 17.500000 | 9456600.0 |
2022-08-26 | TU.BK | 16.821257 | 17.299999 | 17.500000 | 17.100000 | 17.400000 | 15086800.0 |
2022-08-29 | TU.BK | 16.724026 | 17.200001 | 17.400000 | 16.900000 | 17.000000 | 21703600.0 |
2022-08-30 | TU.BK | 17.015724 | 17.500000 | 17.600000 | 17.200001 | 17.200001 | 20849700.0 |
2022-08-31 | TU.BK | 17.015724 | 17.500000 | 17.600000 | 17.299999 | 17.500000 | 25821000.0 |
23304 rows × 6 columns
Calculating Value Average:
# should use average price to * volume , or vwap * average price
stock_df['value']=((stock_df['Open']+stock_df['High']+stock_df['Low']+stock_df['Close'])/4)*stock_df['Volume']
stock_df
Adj Close | Close | High | Low | Open | Volume | value | ||
---|---|---|---|---|---|---|---|---|
Date | Stock | |||||||
2020-09-01 | ADVANC.BK | 165.805130 | 182.500000 | 184.000000 | 182.500000 | 183.500000 | 3103300.0 | 5.682918e+08 |
2020-09-02 | ADVANC.BK | 166.713669 | 183.500000 | 184.000000 | 182.000000 | 182.500000 | 2853100.0 | 5.221173e+08 |
2020-09-03 | ADVANC.BK | 166.713669 | 183.500000 | 184.000000 | 182.500000 | 183.500000 | 2651600.0 | 4.862372e+08 |
2020-09-08 | ADVANC.BK | 164.896637 | 181.500000 | 183.500000 | 181.500000 | 182.500000 | 4256700.0 | 7.757836e+08 |
2020-09-09 | ADVANC.BK | 163.988083 | 180.500000 | 181.500000 | 180.000000 | 180.000000 | 2468800.0 | 4.456184e+08 |
… | … | … | … | … | … | … | … | … |
2022-08-25 | TU.BK | 16.918489 | 17.400000 | 17.700001 | 17.299999 | 17.500000 | 9456600.0 | 1.652541e+08 |
2022-08-26 | TU.BK | 16.821257 | 17.299999 | 17.500000 | 17.100000 | 17.400000 | 15086800.0 | 2.613788e+08 |
2022-08-29 | TU.BK | 16.724026 | 17.200001 | 17.400000 | 16.900000 | 17.000000 | 21703600.0 | 3.716742e+08 |
2022-08-30 | TU.BK | 17.015724 | 17.500000 | 17.600000 | 17.200001 | 17.200001 | 20849700.0 | 3.622635e+08 |
2022-08-31 | TU.BK | 17.015724 | 17.500000 | 17.600000 | 17.299999 | 17.500000 | 25821000.0 | 4.512220e+08 |
23304 rows × 7 columns
Calculating Percent Change and Sign:
stock_df['pct_change']=stock_df['Close'].pct_change()
stock_df['sign']=np.sign(stock_df['pct_change'])
stock_df
Adj Close | Close | High | Low | Open | Volume | value | pct_change | sign | ||
---|---|---|---|---|---|---|---|---|---|---|
Date | Stock | |||||||||
2020-09-01 | ADVANC.BK | 165.805130 | 182.500000 | 184.000000 | 182.500000 | 183.500000 | 3103300.0 | 5.682918e+08 | NaN | NaN |
2020-09-02 | ADVANC.BK | 166.713669 | 183.500000 | 184.000000 | 182.000000 | 182.500000 | 2853100.0 | 5.221173e+08 | 0.005479 | 1.0 |
2020-09-03 | ADVANC.BK | 166.713669 | 183.500000 | 184.000000 | 182.500000 | 183.500000 | 2651600.0 | 4.862372e+08 | 0.000000 | 0.0 |
2020-09-08 | ADVANC.BK | 164.896637 | 181.500000 | 183.500000 | 181.500000 | 182.500000 | 4256700.0 | 7.757836e+08 | -0.010899 | -1.0 |
2020-09-09 | ADVANC.BK | 163.988083 | 180.500000 | 181.500000 | 180.000000 | 180.000000 | 2468800.0 | 4.456184e+08 | -0.005510 | -1.0 |
… | … | … | … | … | … | … | … | … | … | … |
2022-08-25 | TU.BK | 16.918489 | 17.400000 | 17.700001 | 17.299999 | 17.500000 | 9456600.0 | 1.652541e+08 | 0.000000 | 0.0 |
2022-08-26 | TU.BK | 16.821257 | 17.299999 | 17.500000 | 17.100000 | 17.400000 | 15086800.0 | 2.613788e+08 | -0.005747 | -1.0 |
2022-08-29 | TU.BK | 16.724026 | 17.200001 | 17.400000 | 16.900000 | 17.000000 | 21703600.0 | 3.716742e+08 | -0.005780 | -1.0 |
2022-08-30 | TU.BK | 17.015724 | 17.500000 | 17.600000 | 17.200001 | 17.200001 | 20849700.0 | 3.622635e+08 | 0.017442 | 1.0 |
2022-08-31 | TU.BK | 17.015724 | 17.500000 | 17.600000 | 17.299999 | 17.500000 | 25821000.0 | 4.512220e+08 | 0.000000 | 0.0 |
23304 rows × 9 columns
stock_df.groupby(by=['Date','sign'])['value'].sum().to_frame()
value | ||
---|---|---|
Date | sign | |
2020-09-01 | -1.0 | 1.107695e+10 |
1.0 | 1.356994e+10 | |
2020-09-02 | -1.0 | 4.001688e+09 |
0.0 | 3.183170e+09 | |
1.0 | 1.555938e+10 | |
… | … | … |
2022-08-30 | 0.0 | 9.138420e+09 |
1.0 | 2.835648e+10 | |
2022-08-31 | -1.0 | 3.778302e+10 |
0.0 | 3.289448e+09 | |
1.0 | 2.491144e+10 |
1428 rows × 1 columns
Grouping Data and Calculating AD Volume:
ad_volume_df = stock_df.groupby(by=['Date','sign'])['value'].sum().to_frame('ad_volume').reset_index()
ad_volume_df
Date | sign | ad_volume | |
---|---|---|---|
0 | 2020-09-01 | -1.0 | 1.107695e+10 |
1 | 2020-09-01 | 1.0 | 1.356994e+10 |
2 | 2020-09-02 | -1.0 | 4.001688e+09 |
3 | 2020-09-02 | 0.0 | 3.183170e+09 |
4 | 2020-09-02 | 1.0 | 1.555938e+10 |
… | … | … | … |
1423 | 2022-08-30 | 0.0 | 9.138420e+09 |
1424 | 2022-08-30 | 1.0 | 2.835648e+10 |
1425 | 2022-08-31 | -1.0 | 3.778302e+10 |
1426 | 2022-08-31 | 0.0 | 3.289448e+09 |
1427 | 2022-08-31 | 1.0 | 2.491144e+10 |
1428 rows × 3 columns
Formatting the Data:
In this section, we format the data to make it easier to visualize and analyze. We start by pivoting the data using the ‘ad_volume’ values, and setting the ‘Date’ as the index and ‘sign’ as columns.
ad_volume_df = pd.pivot_table(ad_volume_df,values='ad_volume',index=['Date'],columns='sign')
ad_volume_df
sign | -1.0 | 0.0 | 1.0 |
---|---|---|---|
Date | |||
2020-09-01 | 1.107695e+10 | NaN | 1.356994e+10 |
2020-09-02 | 4.001688e+09 | 3.183170e+09 | 1.555938e+10 |
2020-09-03 | 1.076820e+10 | 3.479676e+09 | 8.592459e+09 |
2020-09-08 | 1.692928e+10 | 3.530786e+09 | 1.023250e+09 |
2020-09-09 | 1.138479e+10 | 3.760937e+09 | 8.628323e+09 |
… | … | … | … |
2022-08-25 | 3.004589e+09 | 6.668940e+09 | 3.409971e+10 |
2022-08-26 | 1.934971e+10 | 1.113192e+10 | 1.073419e+10 |
2022-08-29 | 3.348454e+10 | 1.236594e+09 | 8.556465e+09 |
2022-08-30 | 2.793406e+09 | 9.138420e+09 | 2.835648e+10 |
2022-08-31 | 3.778302e+10 | 3.289448e+09 | 2.491144e+10 |
480 rows × 3 columns
This creates a DataFrame ‘ad_volume_df’ with the columns -1.0, 0.0, and 1.0 representing declining volume, unchanged volume, and advancing volume, respectively.
Next, we rename the columns to make them more descriptive: ‘Vol_decline’ for -1.0, ‘Vol_unchanged’ for 0.0, and ‘Vol_advance’ for 1.0.
ad_volume_df = ad_volume_df.rename(columns={-1.0:'Vol_decline',0.0:'Vol_unchanged',1.0:'Vol_advance'})
ad_volume_df
sign | Vol_decline | Vol_unchanged | Vol_advance |
---|---|---|---|
Date | |||
2020-09-01 | 1.107695e+10 | NaN | 1.356994e+10 |
2020-09-02 | 4.001688e+09 | 3.183170e+09 | 1.555938e+10 |
2020-09-03 | 1.076820e+10 | 3.479676e+09 | 8.592459e+09 |
2020-09-08 | 1.692928e+10 | 3.530786e+09 | 1.023250e+09 |
2020-09-09 | 1.138479e+10 | 3.760937e+09 | 8.628323e+09 |
… | … | … | … |
2022-08-25 | 3.004589e+09 | 6.668940e+09 | 3.409971e+10 |
2022-08-26 | 1.934971e+10 | 1.113192e+10 | 1.073419e+10 |
2022-08-29 | 3.348454e+10 | 1.236594e+09 | 8.556465e+09 |
2022-08-30 | 2.793406e+09 | 9.138420e+09 | 2.835648e+10 |
2022-08-31 | 3.778302e+10 | 3.289448e+09 | 2.491144e+10 |
480 rows × 3 columns
We then calculate the ‘net_ad_volume’ by subtracting the declining volume from the advancing volume and add a new column ‘AD_Volume_line’, which is the cumulative sum of the ‘net_ad_volume’ column. Finally, we reset the index of the DataFrame.
ad_volume_df['net_ad_volume'] = ad_volume_df['Vol_advance']- ad_volume_df['Vol_decline']
ad_volume_df['AD_Volume_line']=ad_volume_df['net_ad_volume'].cumsum()
ad_volume_df=ad_volume_df.reset_index()
ad_volume_df
sign | Date | Vol_decline | Vol_unchanged | Vol_advance | net_ad_volume | AD_Volume_line |
---|---|---|---|---|---|---|
0 | 2020-09-01 | 1.107695e+10 | NaN | 1.356994e+10 | 2.492986e+09 | 2.492986e+09 |
1 | 2020-09-02 | 4.001688e+09 | 3.183170e+09 | 1.555938e+10 | 1.155769e+10 | 1.405068e+10 |
2 | 2020-09-03 | 1.076820e+10 | 3.479676e+09 | 8.592459e+09 | -2.175740e+09 | 1.187494e+10 |
3 | 2020-09-08 | 1.692928e+10 | 3.530786e+09 | 1.023250e+09 | -1.590603e+10 | -4.031094e+09 |
4 | 2020-09-09 | 1.138479e+10 | 3.760937e+09 | 8.628323e+09 | -2.756465e+09 | -6.787559e+09 |
… | … | … | … | … | … | … |
475 | 2022-08-25 | 3.004589e+09 | 6.668940e+09 | 3.409971e+10 | 3.109512e+10 | 9.671174e+11 |
476 | 2022-08-26 | 1.934971e+10 | 1.113192e+10 | 1.073419e+10 | -8.615519e+09 | 9.585018e+11 |
477 | 2022-08-29 | 3.348454e+10 | 1.236594e+09 | 8.556465e+09 | -2.492807e+10 | 9.335738e+11 |
478 | 2022-08-30 | 2.793406e+09 | 9.138420e+09 | 2.835648e+10 | 2.556308e+10 | 9.591368e+11 |
479 | 2022-08-31 | 3.778302e+10 | 3.289448e+09 | 2.491144e+10 | -1.287158e+10 | 9.462653e+11 |
480 rows × 6 columns
Now, we create a bar chart using the Plotly library to visualize the Net Advance Volume over time.
fig = px.bar(ad_volume_df,y='net_ad_volume',
x='Date',template='simple_white',
barmode='relative',
title='Market Breadth : Net Advance Volume')
fig.update_xaxes(rangebreaks=[dict(bounds=['sat','mo'])])
fig.update_xaxes(rangeslider_visible=True)
fig.show()
visualize the Advance-Decline Volume data to understand market trends. By creating bar charts, line charts, and bubble charts (if needed), we can analyze the market breadth and make more informed decisions based on the Net Advance Volume. This analysis can be a valuable tool for traders and investors to gauge market sentiment and identify potential opportunities.
Print the index of the data DataFrame, showing the dates included in the dataset.
data.index
DatetimeIndex(['2020-09-01', '2020-09-02', '2020-09-03', '2020-09-08',
'2020-09-09', '2020-09-10', '2020-09-11', '2020-09-14',
'2020-09-15', '2020-09-16',
...
'2022-08-18', '2022-08-19', '2022-08-22', '2022-08-23',
'2022-08-24', '2022-08-25', '2022-08-26', '2022-08-29',
'2022-08-30', '2022-08-31'],
dtype='datetime64[ns]', name='Date', length=480, freq=None)
Print the ‘Date’ column of the ad_volume_df DataFrame, showing the dates included in this dataset.
ad_volume_df['Date']
# check you date range
0 2020-09-01
1 2020-09-02
2 2020-09-03
3 2020-09-08
4 2020-09-09
...
475 2022-08-25
476 2022-08-26
477 2022-08-29
478 2022-08-30
479 2022-08-31
Name: Date, Length: 480, dtype: datetime64[ns]
Calculate the difference between a complete date range from ‘2020-09-01’ to ‘2022-08-31’ and the dates in data.index. This will show the dates without data.
pd.date_range(start='2020-09-01',end='2022-08-31').difference(data.index)
# will show date with out data
DatetimeIndex(['2020-09-04', '2020-09-05', '2020-09-06', '2020-09-07',
'2020-09-12', '2020-09-13', '2020-09-19', '2020-09-20',
'2020-09-26', '2020-09-27',
...
'2022-07-31', '2022-08-06', '2022-08-07', '2022-08-12',
'2022-08-13', '2022-08-14', '2022-08-20', '2022-08-21',
'2022-08-27', '2022-08-28'],
dtype='datetime64[ns]', length=250, freq=None)
Calculate the difference between a complete date range from ‘2020-09-01’ to ‘2022-08-31’ and the dates in ad_volume_df[‘Date’]. Store the result in a variable called df_break.
df_break=pd.date_range(start='2020-09-01',end='2022-08-31').difference(ad_volume_df['Date'])
Create a line chart using the Plotly library to visualize the ‘AD_Volume_line’ column of the ad_volume_df DataFrame. Update the x-axis range breaks using the df_break variable.
fig = px.line(ad_volume_df,y='AD_Volume_line',
x='Date',template='simple_white',
title='Market Breadth : Net Advance Volume')
fig.update_xaxes(rangebreaks=[dict(values=df_break)])
fig.update_xaxes(rangeslider_visible=True)
fig.show()
Extract a specific date range from the ad_volume_df DataFrame (between ‘2021-11-25’ and ‘2021-12-05’) by setting the ‘Date’ as the index and using the loc function.
ad_volume_df.set_index('Date').loc['2021-11-25':'2021-12-05']
sign | Vol_decline | Vol_unchanged | Vol_advance | net_ad_volume | AD_Volume_line |
---|---|---|---|---|---|
Date | |||||
2021-11-25 | 1.511991e+10 | 6.928903e+09 | 1.083175e+10 | -4.288164e+09 | 5.730919e+11 |
2021-11-26 | 6.456038e+10 | 6.928903e+09 | 1.083175e+10 | -4.288164e+09 | 5.730919e+11 |
2021-11-29 | 5.391466e+10 | 1.317914e+09 | 1.009600e+10 | -4.381867e+10 | 5.292732e+11 |
2021-11-30 | 7.962636e+10 | 1.054784e+10 | 1.238412e+10 | -6.724224e+10 | 4.620310e+11 |
2021-12-01 | 5.496541e+09 | 6.043594e+09 | 3.955037e+10 | 3.405382e+10 | 4.960848e+11 |
2021-12-02 | 1.576617e+10 | 4.467534e+09 | 1.239807e+10 | -3.368106e+09 | 4.927167e+11 |
2021-12-03 | 9.278674e+09 | 8.349072e+09 | 5.375706e+09 | -3.902968e+09 | 4.888137e+11 |
Create a line chart using the Plotly library to visualize the ‘AD_Volume_line’ column of the ad_volume_df DataFrame for the specific date range mentioned earlier. Update the x-axis range breaks using the df_break variable.
fig = px.line(ad_volume_df,y='AD_Volume_line',
x='Date',template='simple_white',
title='Market Breadth : Net Advance Volume')
fig.update_xaxes(rangebreaks=[dict(values=df_break)])
fig.update_xaxes(rangeslider_visible=True)
fig.show()