
We select a subset of n = 20 yearly “investible” stocks at the end of each year according to the criterion described above (using the most recent one year of data). On the first trading day of the subsequent year we allocate our investment to this portfolio, hold it for one year, and concurrently collect data during that year so the selection can be repeated at the year's end. We keep repeating this procedure over the entire period for which we wish to evaluate the strategy.
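The annual rotation just described can be sketched as a simple backtest loop. Here `select_stocks` and `annual_return` are hypothetical placeholders for the selection criterion and the realized one-year total return of the chosen portfolio; they are not part of the original text.

```r
# Illustrative sketch of the yearly rotation procedure (not the authors' code).
# select_stocks(y, n): picks n stocks using data from year y (hypothetical).
# annual_return(picks, y): realized return of the portfolio held during year y
# (hypothetical).
backtest <- function(years, select_stocks, annual_return, n = 20) {
  wealth <- 1
  for (y in years) {
    picks  <- select_stocks(y, n)                        # select at end of year y
    wealth <- wealth * (1 + annual_return(picks, y + 1)) # hold through year y + 1
  }
  wealth
}
```

With stub functions in place of the real criterion and return data, the loop simply compounds one realized return per year.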

It is also interesting to note that under the previous motivating rule (i.e. the Max-Median Rule), we would always obtain an exact answer regarding which stocks had the highest max-median values, in a finite and rather short amount of time. This determinism, in the sense that any subsequent run would produce the same results, amounts to a variance of zero. Our modified algorithm, however, is inherently stochastic: we cannot evaluate every conceivable combination of portfolios (an essentially NP-complete search), so we instead randomly select a reasonable number of portfolios for evaluation. As a direct consequence, we expect to observe some natural variation between runs, each run being essentially unique. It is possible (and rather interesting) to exploit this natural variation to assess the overall repeatability of the modified procedure.
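The stochastic element can be pictured as follows: rather than enumerating all possible size-20 portfolios, a fixed number are drawn at random and scored, and repeated runs then differ only through the random draws. This is a minimal sketch, with `score` a hypothetical scoring criterion standing in for the actual evaluation.

```r
# Sketch of the randomized search (an assumption about its structure,
# not the authors' implementation).
# score(pf): hypothetical criterion assigning a value to portfolio pf.
best_of_random <- function(universe, score, n = 20, n_portfolios = 1000) {
  best_value <- -Inf
  best_pf    <- NULL
  for (i in seq_len(n_portfolios)) {
    pf <- sample(universe, n)    # draw one candidate portfolio at random
    s  <- score(pf)
    if (s > best_value) {
      best_value <- s
      best_pf    <- pf
    }
  }
  best_pf
}
```

Running this several times with different random seeds and recording the resulting portfolio returns gives a direct empirical measure of the run-to-run variation discussed above.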

Data summary and description

Data were obtained from the University of Pennsylvania Wharton/WRDS Repository [4]. The following data were utilized for our evaluations:

  1. S&P 500 December Constituents' GVKEYs, 1965 to 2006 (Compustat).
  2. S&P 500 Daily Data [including: Returns with Dividends, Share Price, Shares Outstanding, Adjustment Factors, PERMNOs (CRSP)].
  3. Mapping Table from GVKEYs to PERMNOs.

Data were also obtained from Yahoo! Finance:

  1. Company Tickers for S&P 500 December 2007 Constituents.
  2. Index Returns for SPX (S&P 500 Market-Cap Weighted).
  3. Index Returns for SPX.EW (S&P 500 Equally Weighted, available from mid-2003 to present).

For our evaluations, we note that yearly returns were calculated from the first trading day to the last trading day of each year, with dividends included. The data files analyzed totaled approximately 900 MB.
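A yearly return of this kind is obtained by compounding the daily returns-with-dividends over the year's trading days. The daily series below is illustrative, not taken from the data set.

```r
# Compound CRSP-style daily total returns (first through last trading day
# of a year) into a single yearly return with dividends.
annual_from_daily <- function(daily_returns) {
  prod(1 + daily_returns) - 1
}
```

For example, three trading days with returns of 1%, -2%, and 3% compound to (1.01)(0.98)(1.03) - 1, not to the simple sum of the three returns.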

Parallel processing environment and software

It is worth mentioning some general details of the parallelized implementation of this procedure. It was successfully implemented in the software R, widely and freely available from the Comprehensive R Archive Network (CRAN). Several packages available for R make a parallelized implementation of the algorithm very straightforward. In particular, we made use of snow (see, e.g., [5] and [6]) and snowfall (see [7]), both running over Open MPI. Some of the reasons for choosing this implementation were:

  1. The framework provides a powerful programming interface to a computational cluster (such as those available at Rice University, e.g. SUG@R and ADA).
  2. It is freely available from the Comprehensive R Archive Network (CRAN).
  3. It easily distributes computations of existing functions (after pertinent modifications) to the various compute nodes.
  4. It is excellent for embarrassingly parallel implementations and computations.
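A minimal sketch of how such an embarrassingly parallel evaluation might be distributed with snowfall is shown below. The function `evaluate_portfolio` is a hypothetical per-portfolio scoring routine, and the snippet falls back to a serial `lapply` when snowfall is not installed; this is an illustration of the pattern, not the authors' actual cluster code.

```r
# Sketch: distribute independent portfolio evaluations across workers
# using snowfall (sfInit / sfLapply / sfStop), with a serial fallback.
# evaluate_portfolio is a hypothetical scoring function (an assumption).
evaluate_all <- function(portfolios, evaluate_portfolio, cpus = 4) {
  if (requireNamespace("snowfall", quietly = TRUE)) {
    snowfall::sfInit(parallel = TRUE, cpus = cpus)  # start the worker pool
    on.exit(snowfall::sfStop())                     # always shut it down
    snowfall::sfLapply(portfolios, evaluate_portfolio)
  } else {
    lapply(portfolios, evaluate_portfolio)          # serial fallback
  }
}
```

Because each portfolio is scored independently of the others, no communication between workers is needed, which is exactly what makes the problem embarrassingly parallel.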

Source:  OpenStax, The art of the pfug. OpenStax CNX. Jun 05, 2013 Download for free at http://cnx.org/content/col10523/1.34