Diversified Portfolio of Cryptos in 3 Easy Steps

We’re living in rather interesting times: blockchain technology has been around for quite some time and is still very much in the spotlight. One of its most prominent applications is cryptocurrencies. Now that the hype around 2018’s ICOs and the all-time highs of many coins has faded, and Bitcoin’s volatility has dropped to its lowest point in months, many individuals are asking not just whether to invest in a cryptocurrency but in which one. Or maybe even in several?

When it comes to cryptocurrencies, people often associate crypto only with Bitcoin, but there is a wide variety of coins that differ in their intended use. In this exercise, we’ll do a simple textbook portfolio diversification; its sole purpose is to show how easy it is to perform data analysis in Python.

Step 1. Extract, Transform and Load the data

For our analysis, hourly prices are enough. We head to cryptodatadownload.com and download hourly snapshots of prices for 18 currency pairs traded on the Binance exchange. A few lines of Python code allow us to collect all these CSV files at once.

import pandas as pd
import glob
data = glob.glob("*.csv")  # paths of all CSV files in the working directory

For data analysis, we will use the Python Data Analysis Library, pandas, which operates on so-called DataFrames: two-dimensional tabular data structures with labeled rows and columns. Let’s take a look at a sample DataFrame we have at hand:

df = pd.read_csv(data[0], skiprows=1) # skip the first line of the file so that the column titles are read from the second line
df.head(10) # show the first 10 rows

produces the following output:

XRP/BTC currency pair loaded into a pandas DataFrame from a CSV file.

Voilà! The data looks rather clean and almost ready to use: we have a data point for each hour containing the opening, highest, lowest, and closing price for that hour and the traded volume. We just need to fix one thing: currently, the date is stored as a string, so we need to convert it to a date-and-time format and index our data by that column:

df['datetime'] = pd.to_datetime(df['Date'], format='%Y-%m-%d %I-%p')
df.set_index('datetime', inplace=True)
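
To make the format string concrete, here is a minimal sketch on a tiny in-memory CSV. The rows, the symbol, and the banner line are made up for illustration; the only assumptions are the layout implied by the code above (one line before the column titles) and a `Date` column that looks like `2019-04-01 11-PM`, as inferred from the format string.

```python
import io
import pandas as pd

# Hypothetical sample that mimics the assumed file layout:
# one banner line, then column titles, then hourly rows.
raw = """banner line that precedes the column titles
Date,Symbol,Close
2019-04-01 11-PM,XRPBTC,0.000076
2019-04-01 10-PM,XRPBTC,0.000075
2019-04-01 12-AM,XRPBTC,0.000074
"""

sample = pd.read_csv(io.StringIO(raw), skiprows=1)  # skip the banner line
# %I is the hour on a 12-hour clock and %p is the AM/PM marker
sample["datetime"] = pd.to_datetime(sample["Date"], format="%Y-%m-%d %I-%p")
sample = sample.set_index("datetime").sort_index()
print(sample.index)
```

Note that `12-AM` is parsed as midnight, so after sorting it becomes the first row of the day.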

Step 2. Measure Portfolio Diversification Index (PDI)

Combining different assets in a portfolio changes the return and risk characteristics. However, there is no unique quantitative measure of diversification. We will measure it using the eigenvalues of the covariance matrix of the returns of individual assets making up the combined portfolio (see formula 4 of Measuring Portfolio Diversification by Ulrich Kirchner).

To that end, we concatenate all DataFrames of the 18 currencies we have into one DataFrame as follows:

prices = []
for file in data:
    print("processing {}".format(file))
    d = pd.read_csv(file, skiprows=1)
    d['datetime'] = pd.to_datetime(d['Date'], format='%Y-%m-%d %I-%p')
    d.set_index('datetime', inplace=True)
    symbol = d['Symbol'].iloc[0]
    symbol2 = symbol[:-3] + "/" + symbol[-3:]  # insert "/" before the last three letters (the quote currency)
    df = d[['Close']]  # keep only the closing price
    df.columns = [symbol2]
    prices.append(df)  # collect the per-currency frames for concatenation
df_all = pd.concat(prices, axis=1)

The result has closing prices of currencies in columns, timestamps as rows, and it is aligned by time index (so if for a certain currency there was no data at that point in time, we’ll see a NaN: Not a Number).
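
The alignment behavior can be seen in a small sketch. The two frames below are hypothetical (invented symbols, prices, and timestamps) and only serve to show how `pd.concat` with `axis=1` takes the union of the time indexes.

```python
import pandas as pd

# Two hypothetical closing-price series with partly overlapping timestamps
idx_a = pd.to_datetime(["2019-04-01 00:00", "2019-04-01 01:00", "2019-04-01 02:00"])
idx_b = pd.to_datetime(["2019-04-01 01:00", "2019-04-01 02:00", "2019-04-01 03:00"])
a = pd.DataFrame({"AAA/BTC": [1.0, 1.1, 1.2]}, index=idx_a)
b = pd.DataFrame({"BBB/BTC": [2.0, 2.1, 2.2]}, index=idx_b)

# axis=1 places the frames side by side and takes the union of the indexes
combined = pd.concat([a, b], axis=1)
print(combined)
```

The combined frame has four rows: the first has no value for `BBB/BTC` and the last has none for `AAA/BTC`, so both cells hold NaN.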

Closing price data for 18 currencies combined in one table.

Now, we aren’t interested in the absolute prices of the currencies but rather in their returns, so we transform all the prices stored in the DataFrame into returns. This can be achieved with the following one-liner from the pandas library:

df_ror = df_all.pct_change()
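
As a quick illustration of what `pct_change` computes, here is a sketch on three made-up prices (the symbol and the numbers are hypothetical):

```python
import pandas as pd

prices = pd.DataFrame({"AAA/BTC": [100.0, 110.0, 99.0]})  # hypothetical prices
returns = prices.pct_change()  # (p_t - p_{t-1}) / p_{t-1} for each row
print(returns)

# The first row is NaN because there is no previous price to compare with
clean = returns.dropna()
```

The price goes up by 10% and then down by 10%, so the two defined returns are 0.1 and -0.1.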

The covariance matrix is computed with another method call on our DataFrame:

df_cov = df_ror.cov()


Covariance matrix of returns.
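
One detail worth knowing when the returns contain NaN values: `DataFrame.cov` excludes missing values pairwise, so different entries of the matrix may be based on different sets of rows. The sketch below uses made-up returns to contrast this with dropping incomplete rows first. Which variant suits a given analysis is a judgment call; this is not a statement about what the original analysis did beyond calling `cov()`.

```python
import numpy as np
import pandas as pd

# Hypothetical returns with a gap in the second column
r = pd.DataFrame({
    "AAA/BTC": [0.01, -0.02, 0.03, 0.00, -0.01],
    "BBB/BTC": [0.02, np.nan, 0.01, -0.01, 0.00],
})

pairwise = r.cov()           # each entry uses the rows available for that pair
complete = r.dropna().cov()  # every entry uses the same four complete rows

print(pairwise)
print(complete)
```

Here the variance of `AAA/BTC` differs between the two matrices (five rows versus four), while the off-diagonal covariance is the same because both versions can only use the four rows where both columns are present.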

To compute the eigenvalues of the covariance matrix, we turn to the scientific computing library NumPy:

from numpy.linalg import eig
values, vectors = eig(df_cov)
[8.14029476e-04 4.63245872e-04 3.98177276e-04 3.33604829e-04
 3.00425741e-04 2.39368751e-04 4.17467195e-05 2.00192368e-04
 7.72672381e-05 8.55682527e-05 9.78260310e-05 1.19778856e-04
 1.78244729e-04 1.64263750e-04 1.41537904e-04 1.47484242e-04
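
Because a covariance matrix is symmetric, NumPy's `eigvalsh` is a possible alternative to `eig`: it returns real eigenvalues in ascending order, which saves the explicit sort. The sketch below runs it on synthetic returns (random numbers, not the crypto data), so the printed values are unrelated to the output above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical returns: 500 observations of 4 assets
fake_returns = rng.normal(scale=0.01, size=(500, 4))
cov = np.cov(fake_returns, rowvar=False)  # 4 x 4 symmetric matrix

eigenvalues = np.linalg.eigvalsh(cov)  # real values, ascending order
print(eigenvalues)
```

A handy sanity check is that the eigenvalues add up to the trace of the covariance matrix, that is, to the sum of the individual variances.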

Now we just need to normalize the eigenvalues and sort them:

norm_values = [x / sum(values) for x in values]
sorted_values = sorted(norm_values, reverse=True)
[0.2140626914471838, 0.121818264715791, 0.10470738695992568, 0.08772697991251562, 0.0790019827539921, 0.06294602408955084, 0.05264393783305742, 0.046872438247424014, 0.04319590575990555, 0.0387834527128165, 0.03721976363749711, 0.03149785740356864, 0.025724994119409376, 0.022501605934556457, 0.02031871502073874, 0.010977999452067112, 0.0]

The end goal of today’s exercise, the PDI, is then computed as follows:

PDI = sum([(k + 1) * sorted_values[k] for k in range(len(sorted_values))]) * 2 - 1  # one term per eigenvalue
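
Written as a formula, the line above computes the following. The notation is mine rather than the article's: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N$ are the eigenvalues of the covariance matrix sorted in descending order, $N$ is the number of assets, and $W_k$ are the normalized eigenvalues (`sorted_values` in the code, where the index is zero-based, hence the `k + 1`).

```latex
W_k = \frac{\lambda_k}{\sum_{i=1}^{N} \lambda_i},
\qquad
\mathrm{PDI} = 2 \sum_{k=1}^{N} k \, W_k - 1
```

Two limiting cases help with interpretation. If all eigenvalues are equal, then $W_k = 1/N$ and

```latex
\mathrm{PDI} = \frac{2}{N} \cdot \frac{N(N+1)}{2} - 1 = N
```

If a single eigenvalue carries all the variance, then $W_1 = 1$, all other $W_k = 0$, and $\mathrm{PDI} = 2 \cdot 1 - 1 = 1$. The index therefore lies between 1 and $N$ and can be read as an effective number of independent sources of variation in the portfolio.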

By using the portfolio diversification index to aid portfolio design, asset managers are able to assess whether the addition of new stocks actually helps to further diversify the portfolio and by how much.
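
As a rough illustration of that use, the sketch below compares the PDI before and after adding a candidate asset. Everything in it is an assumption for demonstration purposes: the returns are synthetic random numbers, and `pdi` is a small helper of my own that follows the same eigenvalue recipe as the code above.

```python
import numpy as np
import pandas as pd

def pdi(returns: pd.DataFrame) -> float:
    """PDI from the eigenvalues of the covariance matrix of returns."""
    eigenvalues = np.linalg.eigvalsh(returns.cov())[::-1]  # descending order
    weights = eigenvalues / eigenvalues.sum()
    ranks = np.arange(1, len(weights) + 1)
    return float(2 * np.sum(ranks * weights) - 1)

rng = np.random.default_rng(7)
n_obs = 5000
# Hypothetical base portfolio of three independent assets
base = pd.DataFrame(rng.normal(size=(n_obs, 3)), columns=["A", "B", "C"])

# Candidate 1: a new independent asset; candidate 2: a near copy of asset A
with_new = base.assign(D=rng.normal(size=n_obs))
with_copy = base.assign(D=base["A"] + 0.05 * rng.normal(size=n_obs))

for name, frame in [("base", base), ("plus independent", with_new), ("plus near copy", with_copy)]:
    print(name, round(pdi(frame), 2))
```

In this toy setup the independent candidate raises the PDI, whereas the near copy lowers it below the base value, because it concentrates more of the total variance in a single factor.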

In our case, the 18 assets in our portfolio deliver a PDI of only 9.7.

In other words, the number of assets could be roughly halved.

Step 3. PROFIT!

Just kidding. The next steps could be:

  • Take a closer look at the PDI increase as each asset is added, then kick out the assets with the smallest PDI increase; the result is potentially a more diversified portfolio than the one consisting of all cryptocurrencies.
  • Decide how to distribute the funds: the PDI doesn’t support asset weights, but they can be incorporated.
  • Keep in mind that our diversification is based solely on past returns. Past performance is not a guarantee of future returns, and past correlations do not necessarily hold perfectly (or at all) in the future, so be critical of your results. Investing involves risk: the value of your investment will fluctuate over time, and you may gain or lose money. This point deserves an article in its own right.
  • Watch out for selection bias. When analyzing the stocks that make up the S&P 500 index today, we are prone to selection bias: we look at established companies chosen by a committee to be on the S&P 500 list. The same holds for cryptos: the top 20 cryptocurrencies by volume today might have been nowhere near the top a year or two ago. (The author of this article carried out a similar analysis two years ago for the top 20 coins by volume; the six cryptos that diversified the portfolio the most back then were BTC, EOS, ETH, ICX, XEM, and NEO, which at the time of writing are ranked 1, 8, 2, 62, 26, and 22 respectively.)
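
The first of these next steps could be sketched as a simple backward elimination. This is only one possible reading of the idea, not the author's implementation: `prune_by_pdi` is a hypothetical helper, `pdi` repeats the eigenvalue recipe used in this article, and the returns are synthetic.

```python
import numpy as np
import pandas as pd

def pdi(returns: pd.DataFrame) -> float:
    """PDI from the covariance eigenvalues, sorted in descending order."""
    eigenvalues = np.linalg.eigvalsh(returns.cov())[::-1]
    weights = eigenvalues / eigenvalues.sum()
    return float(2 * np.sum(np.arange(1, len(weights) + 1) * weights) - 1)

def prune_by_pdi(returns: pd.DataFrame, keep: int) -> list:
    """Hypothetical helper: repeatedly drop the asset that contributes least to the PDI."""
    selected = list(returns.columns)
    while len(selected) > keep:
        # PDI of the remaining portfolio if each candidate were removed
        without = {c: pdi(returns[[s for s in selected if s != c]]) for c in selected}
        # the least useful asset is the one whose removal leaves the highest PDI
        selected.remove(max(without, key=without.get))
    return selected

# Synthetic returns: C is almost a copy of A, so one of the two is redundant
rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.normal(size=(2000, 3)), columns=["A", "B", "D"])
demo["C"] = demo["A"] + 0.05 * rng.normal(size=2000)
kept = prune_by_pdi(demo, keep=3)
print(kept)
```

On this toy data the procedure keeps `B` and `D` and discards one of the two nearly identical assets `A` and `C`. A greedy scheme like this is cheap but not guaranteed to find the best subset, so it is best treated as a starting point.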

Code for this short analysis can be found on the GitHub repository of CBA Finance: link

Some interesting links on the topic

  • “Cryptocurrency basics — 3 key characteristics and why they matter” link
  • “Crypto Fundamental Analysis, Part II” link
  • A more detailed explanation of the PDI can be found at the following link (formula 2.11)


For those willing to go the extra mile, here are a few lines of code to figure out which coin contributes the most to the increase of the PDI:

def computePDI(df_ror):
    df_cov = df_ror.cov()
    values, vectors = eig(df_cov)
    norm_values = [x / sum(values) for x in values]
    sv = sorted(norm_values, reverse=True)
    PDI = sum([(k + 1) * sv[k] for k in range(0,len(df_ror.columns))]) * 2 - 1
    return PDI
# contribution of a coin = PDI of the full portfolio minus the PDI without that coin
# (assumed iterable: all columns of df_ror)
contribution = [(col, PDI - computePDI(df_ror.drop(col, axis=1))) for col in df_ror.columns]
sorted(contribution, key=lambda x: -x[1])
[('ADX/BTC', 0.7307601365889358),  
('VEN/BTC', 0.7017115109515188),  
('SALT/BTC', 0.7001692470502938),  
('BTG/BTC', 0.6896511007294226),  
('XRP/BTC', 0.6413974443512025),  
('EOS/BTC', 0.615769838071694),  
('STRAT/BTC', 0.6149974150543986),  
('NEO/BTC', 0.5770995515045136),  
('ETC/BTC', 0.5021722917694369),  
('ADA/BTC', 0.49516779035836933),  
('WTC/BTC', 0.47919180030499575),  
('DASH/BTC', 0.4761748943410229),  
('XLM/BTC', 0.46674924385415295),  
('AST/BTC', 0.42159144714936225),  
('LTC/BTC', 0.3927338317640654),  
('ETH/BTC', 0.20962708226304905),  
('IOTA/BTC', -1.7763568394002505e-15)]

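Adding IOTA (MIOTA) to our portfolio has a nominally negative effect on the diversification index, but a value of about -1.8e-15 is within floating-point rounding error, so its contribution is effectively zero.

Also, among the coins with a positive contribution, Ethereum (ETH) provides the lowest contribution to diversification.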

Could it be due to the fact that Ethereum is a platform for many other coins and therefore acts as a (weighted) index of them? Let’s continue the discussion in the comments section!


All investment/financial opinions expressed in the article are from the personal research and experience of Dr. Yury Chebiryak and are intended as educational material. Dr. Yury Chebiryak is not a registered broker, dealer, legal or tax advisor. All opinions presented don’t necessarily represent views and analyses performed by CBA Finance AG. Although best efforts are made to ensure that all information is accurate and up to date, occasionally unintended errors and misprints may occur.
This content is intended to be used and must be used for informational purposes only. It is very important to do your own analysis before making any investment based on your own personal circumstances. You should take independent financial advice from a professional in connection with, or independently research and verify, any information that you find on this blog and wish to rely upon, whether for the purpose of making an investment decision or otherwise.

The information presented in this blog post is for personal use. Investing involves a great deal of risk, including the loss of all or a portion of your investment, as well as emotional distress. Nothing contained in this article should be construed as a warranty of investment results. All risks, losses, and costs associated with investing, including total loss of principal, are your own responsibility. It is possible that Dr. Yury Chebiryak may have a position in cryptocurrencies discussed within this blog post.

(C) 2019 Yury Chebiryak. All information provided on this website is the property of Yury Chebiryak and should not be reproduced, copied, redistributed, transferred, or sold without the prior written consent of Yury Chebiryak. All rights reserved.
