Chris Malec Data Scientist

Numpy Neural Nets

Another Numpy Neural Net

There’s a number of resources on creating a neural net “from scratch” out there. I put “from scratch” in quotes because numpy is a lot of machinery to start with, but I think it’s a fair label because numpy looks about as close to the abstract math that powers these things as you can get. One goal here is to demonstrate development fundamentals I’ve picked up as a scientist-turned-dev; another was to see just how much copilot could do for me on a task that I know is well represented in the training set. The short answer is: quite a bit, but you really need to know what you’re doing or it’ll get you into trouble fast.

The Bedrock of Deep Learning: Differentiable Functions

A well-known but under-appreciated fact about most machine learning models is that the all-important loss and activation functions have to be differentiable. So I could have made an Activation class and a Loss class first, but for pedagogical reasons, I made a DifferentiableFunction class first. The first subclass is of course our friend softmax. I just typed in the class names and tabbed my way to success on these, including type hints and function signatures for the callables (highly recommended).

import numpy as np
from typing import Callable
class DifferentiableFunction:
    function: Callable[[np.ndarray], np.ndarray]
    derivative: Callable[[np.ndarray], np.ndarray]

    def __init__(self, function: Callable[[np.ndarray], np.ndarray], derivative: Callable[[np.ndarray], np.ndarray]):
        self.function = function
        self.derivative = derivative

class SoftMax(DifferentiableFunction):
    def __init__(self):
        def softmax(x: np.ndarray) -> np.ndarray:
            e_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
            return e_x / e_x.sum(axis=-1, keepdims=True)

        def softmax_derivative(x: np.ndarray) -> np.ndarray:
            #note: this is the elementwise (diagonal) term of the softmax Jacobian,
            #not the full Jacobian, which is diag(s) - s s^T
            s = softmax(x)
            return s * (1 - s)

        super().__init__(softmax, softmax_derivative)

The super() that appears in SoftMax calls the __init__ method from the parent class, so after defining the softmax function and its derivative, the class builds a DifferentiableFunction out of those two functions.
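To show how the pattern extends, here’s a second activation written the same way. This ReLU subclass is my own sketch (it wasn’t part of the copilot session), but it slots in right next to SoftMax:

class ReLU(DifferentiableFunction):
    def __init__(self):
        def relu(x: np.ndarray) -> np.ndarray:
            return np.maximum(x, 0)

        def relu_derivative(x: np.ndarray) -> np.ndarray:
            #1 where x > 0, and 0 elsewhere
            return (x > 0).astype(x.dtype)

        super().__init__(relu, relu_derivative)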

I also tried to get copilot to write me some tests, which for this task were pretty all right. Testing numerical functions is very straightforward: you give the function some inputs (though I should also include 0) and check that it outputs what it should. Also, though I see plenty of devs with more years under their belt than me do the same, I feel like copilot could have made a fixture to house the test data, since it’s going to be used over and over again. However, there is something to be said for having the data right there in the test.

import numpy as np
from DifferentiableFunction import DifferentiableFunction, SoftMax

def test_softmax_function():
    sm = SoftMax()
    x = np.array([1.0, 2.0, 3.0])
    result = sm.function(x)
    expected = np.exp(x - np.max(x)) / np.sum(np.exp(x - np.max(x)))
    np.testing.assert_allclose(result, expected, rtol=1e-6)

def test_softmax_derivative_shape():
    sm = SoftMax()
    x = np.array([1.0, 2.0, 3.0])
    deriv = sm.derivative(x)
    assert deriv.shape == x.shape
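For reference, the fixture version I had in mind would look something like this; a sketch of mine rather than copilot’s output, with the zero vector included:

import pytest
import numpy as np
from DifferentiableFunction import SoftMax

@pytest.fixture
def sample_inputs():
    #shared test data, including the zero vector I mentioned above
    return [np.array([1.0, 2.0, 3.0]), np.array([0.0, 0.0, 0.0])]

def test_softmax_sums_to_one(sample_inputs):
    sm = SoftMax()
    for x in sample_inputs:
        np.testing.assert_allclose(sm.function(x).sum(), 1.0, rtol=1e-6)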

Another note on the tests: this is an area where you need to pay for the better copilot plan to get more than a tiny number of tests written for you, but I might just do that. Even though, as we will see later, the tests become less impressive as our tasks become more abstract, the fact that the computer did the gruntwork of writing some test functions and later a test class, including throwing some sample data in there, is really nice. It needs to be done, it’s not hard to do, and I hate doing it, so fine Microsoft, take my money.

In the next section, we will make a Layer class!

The IEX Order Book - Reading the Data

I recently gave a talk to the Python Frederick Meetup about Pandas and wanted to show some financial data, since that’s what I’ve been working with lately. Specifically, I wanted something that had real-time quotes and trades, but at best most sources give you the high, low, and median price aggregated over something like an hour or a day. There are a lot of sources of market data, and they have most definitely put a price on that data: a high price.

NYSE TAQ is about $1000/month for the first 12 months and $500/month to go farther back in time. NASDAQ TotalView and similar products have a pretty complex fee structure, so I’m not entirely sure how much it costs. You can at least see the real-time data in your RobinHood (Gold account) or Charles Schwab account, but historical data’s going to cost you.

Which brings us to the Investor’s Exchange, or IEX. Featured in the Michael Lewis book Flash Boys, they do their best to offer access to the market with protections in place that favor long-term over short-term strategies, keeping in mind ‘long term’ might still mean only holding a stock for a millisecond. They’re pretty small compared to NYSE or NASDAQ, but they allow you to download pretty detailed historical data for free…as a dump of network packets.

What is an order book?

Why on earth do I want to see the order book in the first place? What is it? Every time a broker wants to make a trade, they put in an order (I want to buy 200 shares of stock A for $100 each). Until the order is canceled, or a seller is found who is willing to meet the buyer at their price, the order ‘rests on the book.’ Most heavily traded stocks do not have orders that rest long, unless the number of shares is exceptionally large or the price point is completely unreasonable. Here’s an extremely simplified view of what an order book might look like.

time side quantity price
9:30 B 100 45.00
9:31 B 200 45.01
9:31 S 100 45.05
9:32 B 50 45.00

What do you get from the IEX feed?

In the DEEP data product, you get a number of message types, notably, when there is a trade, and when there is a price level update. This isn’t quite the order book, in that many orders could be combined together to create one price level update, but you do see how many shares are bid (I will buy your stock) or offered (I want to sell you my stock) at each dollar amount, which is pretty darn good for free. Important to note again: IEX is only a few percent of total market volume, so you’re not seeing the whole market, but you get a neat view that you would typically have to pay a lot to see. Here, you only pay with parsing pain.

!xxd -l 112 -s 20000 ~/Downloads/data_feeds_20210122_20210122_IEXTP1_DEEP1.0.pcap
00004e20: 7cb9 0500 ed64 2f58 e805 0000 e805 0000  |....d/X........
00004e30: 0100 5e57 1504 b859 9ff9 2d53 0800 4500  ..^W...Y..-S..E.
00004e40: 05da e340 4000 4011 9f90 17e2 9b84 e9d7  ...@@.@.........
00004e50: 1504 288a 288a 05c6 0402 0100 0480 0100  ..(.(...........
00004e60: 0000 0000 d948 9605 4200 fc42 0000 0000  .....H..B..B....
00004e70: 0000 1903 0000 0000 0000 f76d ff74 b88d  ...........m.t..
00004e80: 5c16 1600 4854 c51f ff74 b88d 5c16 4149  \...HT...t..\.AI

And that’s what you get. A lot of hex digits and a specifications document. Fortunately, there is a super cool person on the internet who made a parser that can get all this data into a json format based on the specifications document.

#A lovely command line conversion of the binary to a json file
!pip install iex_parser
!iex_to_json -i iex_file.pcap.gz -o iex_file.json.gz -t 'GME' -s
import json
with open('iex_deep_quotes_and_trades.json') as file:
    line = file.readline()
    print(line[:500])
[{"type":"trade_report","event":null,"timestamp":"2021-01-22T13:02:31.215300+00:00","status":null,"symbol":"GME","detail":null,"halt_status":null,"reason":null,"flags":96.0,"size":39.0,"price":45.19,"trade_id":2067095.0,"side":null,"security_event":null},{"type":"trade_report","event":null,"timestamp":"2021-01-22T13:08:14.700160+00:00","status":null,"symbol":"GME","detail":null,"halt_status":null,"reason":null,"flags":96.0,"size":50.0,"price":44.87,"trade_id":2639914.0,"side":null,"security_even

It’s still hard to understand without reading the documentation and a little knowledge of how order books work, but it’s a heck of a lot more readable, and easy to turn into a dataframe.

#Read in json file, many json files that have a schema without 
#a lot of nesting and variation can be read safely as records
import pandas as pd

json_df = pd.read_json('iex_deep_quotes_and_trades.json',orient='records')
display(json_df)
type event timestamp ... trade_id side security_event
0 trade_report NaN 2021-01-22 13:02:31.215300+00:00 ... 2.067095e+06 None NaN
1 trade_report NaN 2021-01-22 13:08:14.700160+00:00 ... 2.639914e+06 None NaN
2 trade_report NaN 2021-01-22 13:11:52.294756+00:00 ... 3.063945e+06 None NaN
3 trade_report NaN 2021-01-22 13:18:22.383301+00:00 ... 3.669247e+06 None NaN
4 trade_report NaN 2021-01-22 13:19:09.002873+00:00 ... 3.739332e+06 None NaN
... ... ... ... ... ... ... ...
490763 price_level_update NaN 2021-01-28 21:59:06.793899+00:00 ... NaN S NaN
490764 price_level_update NaN 2021-01-28 21:59:06.797557+00:00 ... NaN S NaN
490765 trade_report NaN 2021-01-28 21:59:08.241677+00:00 ... 3.027362e+09 None NaN
490766 price_level_update NaN 2021-01-28 21:59:08.241677+00:00 ... NaN S NaN
490767 price_level_update NaN 2021-01-28 22:00:00.018339+00:00 ... NaN S NaN

490768 rows × 14 columns

#see what the na situation is
json_df.isna().sum()
type                   0
event             490768
timestamp              0
status            490768
symbol                 0
                   ...  
size                   0
price                  0
trade_id          356056
side              134712
security_event    490768
Length: 14, dtype: int64
#get rid of columns that are entirely null
json_df = json_df.dropna(axis = 1,how='all')
display(json_df)
type timestamp symbol ... price trade_id side
0 trade_report 2021-01-22 13:02:31.215300+00:00 GME ... 45.19 2.067095e+06 None
1 trade_report 2021-01-22 13:08:14.700160+00:00 GME ... 44.87 2.639914e+06 None
2 trade_report 2021-01-22 13:11:52.294756+00:00 GME ... 44.58 3.063945e+06 None
3 trade_report 2021-01-22 13:18:22.383301+00:00 GME ... 44.04 3.669247e+06 None
4 trade_report 2021-01-22 13:19:09.002873+00:00 GME ... 43.78 3.739332e+06 None
... ... ... ... ... ... ... ...
490763 price_level_update 2021-01-28 21:59:06.793899+00:00 GME ... 262.00 NaN S
490764 price_level_update 2021-01-28 21:59:06.797557+00:00 GME ... 262.00 NaN S
490765 trade_report 2021-01-28 21:59:08.241677+00:00 GME ... 262.00 3.027362e+09 None
490766 price_level_update 2021-01-28 21:59:08.241677+00:00 GME ... 262.00 NaN S
490767 price_level_update 2021-01-28 22:00:00.018339+00:00 GME ... 988.00 NaN S

490768 rows × 8 columns

#What are we left with?
json_df.isna().sum()
type              0
timestamp         0
symbol            0
flags             0
size              0
price             0
trade_id     356056
side         134712
dtype: int64
#What data types are we working with?
json_df.dtypes
type                      object
timestamp    datetime64[ns, UTC]
symbol                    object
flags                      int64
size                       int64
price                    float64
trade_id                 float64
side                      object
dtype: object
#The objects really should be strings
json_df = json_df.astype({'type':'string','symbol':'string','side':'string'})
json_df.dtypes
type                      string
timestamp    datetime64[ns, UTC]
symbol                    string
flags                      int64
size                       int64
price                    float64
trade_id                 float64
side                      string
dtype: object
#Fill in nulls on the side, since that may cause trouble plotting trades
#create a date column for filtering purposes, and change from UTC to EST
json_df = json_df.fillna({'side':'X'})#replace nulls

json_df['date'] = json_df['timestamp'].dt.date#UTC date, used for filtering

json_df['timestamp'] = (json_df['timestamp']
                        .dt.tz_convert('EST')
                        .dt.tz_localize(None))#change to local time

display(json_df)
type timestamp symbol ... trade_id side date
0 trade_report 2021-01-22 08:02:31.215300 GME ... 2.067095e+06 X 2021-01-22
1 trade_report 2021-01-22 08:08:14.700160 GME ... 2.639914e+06 X 2021-01-22
2 trade_report 2021-01-22 08:11:52.294756 GME ... 3.063945e+06 X 2021-01-22
3 trade_report 2021-01-22 08:18:22.383301 GME ... 3.669247e+06 X 2021-01-22
4 trade_report 2021-01-22 08:19:09.002873 GME ... 3.739332e+06 X 2021-01-22
... ... ... ... ... ... ... ...
490763 price_level_update 2021-01-28 16:59:06.793899 GME ... NaN S 2021-01-28
490764 price_level_update 2021-01-28 16:59:06.797557 GME ... NaN S 2021-01-28
490765 trade_report 2021-01-28 16:59:08.241677 GME ... 3.027362e+09 X 2021-01-28
490766 price_level_update 2021-01-28 16:59:08.241677 GME ... NaN S 2021-01-28
490767 price_level_update 2021-01-28 17:00:00.018339 GME ... NaN S 2021-01-28

490768 rows × 9 columns

#Create subset that is just the order book updates
mask = json_df['type']=='price_level_update'
select_cols = ['timestamp','size','price','side']
sort_cols = ['timestamp','price']
order_df = json_df.loc[mask,select_cols].sort_values(sort_cols)

I’m not sure if this is the most effective way to accomplish the task, but I felt like making a class to do the data manipulation necessary to carry the information forward from one row to the next. The goal is for each timestep to specify the correct number of shares at each price level, which amounts to accumulating all the different price levels forward in time. If there were only a few price levels, breaking them out into their own columns and carrying each column forward would be the most effective approach (see the sketch after the next cell). However:

print('There are ',len(order_df.price.unique()),' price levels.')
There are  11848  price levels.
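With nearly twelve thousand levels, wide columns are out, but for completeness, here is a sketch of what the wide version could look like (since each update carries the new absolute size at a level, forward-filling each column does the accumulating):

#Pivot so each (side, price) pair gets its own column, then carry sizes forward
wide = (order_df
        .groupby(['timestamp','side','price'])['size'].last()
        .unstack(['side','price'])
        .ffill()
        .fillna(0))

Instead, a class: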
class PriceLevels(dict):
    def add_or_discard(self,size,price,side,quote_side):
        #only apply updates meant for this side of the book;
        #a size of zero means the price level is gone
        if side == quote_side:
            if size > 0:
                self.update({price:size})
            else:
                self.pop(price,None)
        return self
    
    def get_bbo(self,side):
        #best bid is the highest resting buy price, best offer the lowest sell price
        if (side == 'B')&(len(self)>0):
            return max(self.keys())
        elif (side == 'S')&(len(self)>0):
            return min(self.keys())
        else:
            return None
    
    def get_vwap(self):
        #volume-weighted average price across all resting levels
        if len(self)==0:
            return None
        return sum([k*v for k,v in self.items()])/sum(self.values())
            
    def update_prices(self,size,price,side,quote_side):
        #apply the update, then hand back a snapshot so each timestep keeps its own copy
        self.add_or_discard(size,price,side,quote_side)
        return PriceLevels(self.copy())
#Some examples of how the object works
a = PriceLevels({45.45:100,50:100,55:50})
print('set price levels: ',a)
a.update_prices(100,46.05,'B','B')
print('update price level same side: ',a)
a.update_prices(100,46.10,'S','B')
print('update price level opposite side: ',a)
a.update_prices(0,50,'B','B')
print('remove price same side: ',a)
print('get the best bid: ',a.get_bbo('B'))
print('get vwap: ',a.get_vwap())
set price levels:  {45.45: 100, 50: 100, 55: 50}
update price level same side:  {45.45: 100, 50: 100, 55: 50, 46.05: 100}
update price level opposite side:  {45.45: 100, 50: 100, 55: 50, 46.05: 100}
remove price same side:  {45.45: 100, 55: 50, 46.05: 100}
get the best bid:  55
get vwap:  47.6
#For each timestamp, find the current size at each available price, the best bid and offer,
#as well as the VWAP (volume-weighted average price) of the buy orders, sell orders, and all orders
bid = PriceLevels()
ofr = PriceLevels()
quotes = dict()
#Walk the updates in time order with iterrows(), carrying the book state forward
for _,(timestamp,size,price,side) in order_df.iterrows():
    quotes[timestamp] = {'bid':bid.update_prices(size,price,side,'B'),
                         'ofr':ofr.update_prices(size,price,side,'S'),
                         'best_bid':bid.get_bbo('B'),
                         'bid_vwap':bid.get_vwap(),
                         'best_ofr':ofr.get_bbo('S'),
                         'ofr_vwap':ofr.get_vwap(),
                         'avg_vwap':PriceLevels({**bid,**ofr}).get_vwap(),
                         'date':timestamp.date()}
list(quotes.items())[500:504]
[(Timestamp('2021-01-22 09:31:48.205629'),
  {'bid': {24.98: 100},
   'ofr': {43.61: 300},
   'best_bid': 24.98,
   'bid_vwap': 24.98,
   'best_ofr': 43.61,
   'ofr_vwap': 43.61,
   'avg_vwap': 38.9525,
   'date': datetime.date(2021, 1, 22)}),
 (Timestamp('2021-01-22 09:31:48.320642'),
  {'bid': {24.98: 100},
   'ofr': {43.61: 300, 43.59: 300},
   'best_bid': 24.98,
   'bid_vwap': 24.98,
   'best_ofr': 43.59,
   'ofr_vwap': 43.6,
   'avg_vwap': 40.94,
   'date': datetime.date(2021, 1, 22)}),
 (Timestamp('2021-01-22 09:31:48.320661'),
  {'bid': {24.98: 100},
   'ofr': {43.59: 300},
   'best_bid': 24.98,
   'bid_vwap': 24.98,
   'best_ofr': 43.59,
   'ofr_vwap': 43.59,
   'avg_vwap': 38.93750000000001,
   'date': datetime.date(2021, 1, 22)}),
 (Timestamp('2021-01-22 09:31:48.325218'),
  {'bid': {24.98: 100},
   'ofr': {43.6: 300},
   'best_bid': 24.98,
   'bid_vwap': 24.98,
   'best_ofr': 43.6,
   'ofr_vwap': 43.6,
   'avg_vwap': 38.945,
   'date': datetime.date(2021, 1, 22)})]
#Create a dataframe of the current state of the order book at each timestamp
quote_df = (pd.DataFrame
            .from_dict(quotes,orient='index')
            .reset_index()
            .rename(columns={'index':'timestamp'})
            .dropna(subset=['best_bid','best_ofr'],how='all'))

display(quote_df)
timestamp bid ofr ... ofr_vwap avg_vwap date
0 2021-01-22 08:25:48.218283 {44.45: 1000} {} ... NaN 44.45000 2021-01-22
2 2021-01-22 08:44:23.578790 {44.04: 1000} {} ... NaN 44.04000 2021-01-22
4 2021-01-22 08:49:02.526710 {} {43.66: 100} ... 43.66000 43.66000 2021-01-22
6 2021-01-22 08:49:28.189795 {43.34: 100} {} ... NaN 43.34000 2021-01-22
8 2021-01-22 08:51:11.786607 {43.41: 100} {} ... NaN 43.41000 2021-01-22
... ... ... ... ... ... ... ...
342613 2021-01-28 16:56:26.486624 {} {988.0: 112} ... 988.00000 988.00000 2021-01-28
342614 2021-01-28 16:59:06.409716 {} {988.0: 112, 262.0: 100} ... 645.54717 645.54717 2021-01-28
342615 2021-01-28 16:59:06.793899 {} {988.0: 112} ... 988.00000 988.00000 2021-01-28
342616 2021-01-28 16:59:06.797557 {} {988.0: 112, 262.0: 100} ... 645.54717 645.54717 2021-01-28
342617 2021-01-28 16:59:08.241677 {} {988.0: 112} ... 988.00000 988.00000 2021-01-28

342023 rows × 9 columns

#Get just the trades from the original dataframe
trade_df = json_df.loc[json_df['type']=='trade_report',:]
trade_df
type timestamp symbol ... trade_id side date
0 trade_report 2021-01-22 08:02:31.215300 GME ... 2.067095e+06 X 2021-01-22
1 trade_report 2021-01-22 08:08:14.700160 GME ... 2.639914e+06 X 2021-01-22
2 trade_report 2021-01-22 08:11:52.294756 GME ... 3.063945e+06 X 2021-01-22
3 trade_report 2021-01-22 08:18:22.383301 GME ... 3.669247e+06 X 2021-01-22
4 trade_report 2021-01-22 08:19:09.002873 GME ... 3.739332e+06 X 2021-01-22
... ... ... ... ... ... ... ...
490754 trade_report 2021-01-28 16:54:05.085758 GME ... 3.026635e+09 X 2021-01-28
490755 trade_report 2021-01-28 16:54:05.592947 GME ... 3.026636e+09 X 2021-01-28
490759 trade_report 2021-01-28 16:56:26.486624 GME ... 3.026962e+09 X 2021-01-28
490762 trade_report 2021-01-28 16:59:06.793899 GME ... 3.027353e+09 X 2021-01-28
490765 trade_report 2021-01-28 16:59:08.241677 GME ... 3.027362e+09 X 2021-01-28

134712 rows × 9 columns

#Combine the two dataframes so that each trade report is associated 
#with a set of 'prevailing quotes.'
plot_df = (pd.concat([quote_df,trade_df],ignore_index=False)
             .fillna({'type':'quote','symbol':'GME'})
             .sort_values(['timestamp','type'])
             .drop('side',axis=1)
             .ffill()
             .dropna(subset=['best_bid','best_ofr'])
             .astype({'date':'string'}))

plot_df = plot_df.loc[plot_df['type']=='trade_report',:]

plot_df['midpt'] = (plot_df['best_bid']+plot_df['best_ofr'])/2

display(plot_df.loc[:,['timestamp','price','best_bid','best_ofr','midpt']])
timestamp price best_bid best_ofr midpt
27 2021-01-22 08:53:12.048708 43.85 43.55 43.66 43.605
28 2021-01-22 08:54:24.385311 44.09 43.55 43.66 43.605
31 2021-01-22 08:57:36.794024 44.22 43.55 44.30 43.925
32 2021-01-22 08:57:39.852032 44.20 43.55 44.30 43.925
33 2021-01-22 08:57:41.975903 44.28 43.55 44.30 43.925
... ... ... ... ... ...
490754 2021-01-28 16:54:05.085758 257.58 257.58 988.00 622.790
490755 2021-01-28 16:54:05.592947 257.58 257.58 988.00 622.790
490759 2021-01-28 16:56:26.486624 264.00 257.58 988.00 622.790
490762 2021-01-28 16:59:06.793899 262.00 257.58 988.00 622.790
490765 2021-01-28 16:59:08.241677 262.00 257.58 988.00 622.790

134697 rows × 5 columns

I’d like to make a special note that concatenating two dataframes, sorting them, and using the forward fill option to fill in a prevailing value is a nice trick. The logically obvious approach, if you only wanted the trades, is some kind of time-range join, which would take waaaaaay longer. Concatenating is fast, sorting is fast, forward filling is fast, and throwing away a ton of rows is fast. Joins are slow.
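That said, pandas does have a purpose-built as-of join, pd.merge_asof, which attaches the most recent quote to each trade in one call and is also fast, since it works on sorted inputs rather than doing a naive range join. A sketch of how it might apply here, assuming both frames are sorted by timestamp:

#Attach the prevailing quote to each trade via a backward as-of join
prevailing = pd.merge_asof(trade_df.sort_values('timestamp'),
                           quote_df[['timestamp','best_bid','best_ofr']].sort_values('timestamp'),
                           on='timestamp',
                           direction='backward')
prevailing['midpt'] = (prevailing['best_bid']+prevailing['best_ofr'])/2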

Market Manipulation

I’m moving on from my time at FINRA, and thought it was a good time to reflect on some of the interesting things I’ve learned about how the stock market works.

Super Basic Market Structure: The Order Book

Every time a broker wants to make a trade, they put in an order (I want to buy 200 shares of stock A for $100 each). Until the order is canceled, or a seller is found who is willing to meet the buyer at their price, the order ‘rests on the book.’ Most heavily traded stocks do not have orders that rest long, unless the number of shares is exceptionally large or the price point is completely unreasonable.

time side quantity price
9:30 B 100 45.00
9:31 B 200 45.01
9:31 S 100 45.05
9:32 B 50 45.00

Pretty Basic Market Structure: Reg NMS

Reg NMS is an SEC rule that created the ‘National Market System.’ Stock exchanges became publicly traded companies, and firms routing orders to the markets gained the obligation of ‘best execution.’ What exactly this means is fairly complicated, because it’s not just about price. But a simple explanation is that if NYSE’s order book has a stock selling at 45.05 and NASDAQ’s order book has the same stock selling at 45.04, your buy order with your Charles Schwab account needs to take the shares at NASDAQ at the better price. Complications set in when the shares in your order are not all available at the same price.

The regulation is intended to keep brokerages from giving customers a bad deal. A couple more important terms are ‘best bid’ and ‘best offer’ that refer to the highest price anyone is willing to buy and the lowest price anyone is willing to sell respectively.
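To make the price-priority piece concrete, here is a toy illustration with made-up quotes (real routing weighs much more than price):

#A buy order should route to the venue showing the lowest best offer
best_offers = {'NYSE': 45.05, 'NASDAQ': 45.04}
venue = min(best_offers, key=best_offers.get)
print(venue)#NASDAQ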

What is market manipulation?

This is relatively subjective, but the crux of many types of manipulation is ‘are you engaging with the market in good faith?’ In other words, are you placing orders onto the books because you would like to buy a stock at the price you stated (or at least hope someone will sell to you at that price), or are you sending false signals to manipulate the prices at which people are willing to buy or sell?

Here is an example of a sequence of orders that includes a manipulator. The manipulator enters many sell orders to entice the other sellers to lower their prices. Keep in mind, most participants can’t see that right-hand column.

time side quantity price firm
9:30 B 100 45.00 A
9:31 B 200 45.01 A
9:31 S 100 45.05 B
9:32 B 50 45.00 A
9:33 S 100 45.04 Manip
9:33 S 200 45.04 Manip
9:33 S 500 45.03 Manip
9:33 S 100 45.05 Manip
9:33 S 400 45.03 Manip
9:33 S 100 45.03 Manip
9:34 S 600 45.02 IllBite
9:34 B 600 45.02 Manip

I made up a more complicated example as a .csv file; let’s see how it looks when plotted.

import pandas as pd
manip_df = pd.read_csv('example_manip.csv',
                       sep=',',
                       header=1)
display(manip_df)
Time Firm Side Price Quantity Order ID Action NBO NBB
0 10/13/2021 9:31:00.0 AM Firm S S 50.05 1000 1 Order 50.05 49.95
1 10/13/2021 9:31:00.0 AM Firm B B 49.95 1000 2 Order 50.05 49.95
2 10/13/2021 9:31:00.2 AM Firm M S 50.04 200 3 Order 50.04 49.95
3 10/13/2021 9:31:01.5 AM Firm M S 50.03 100 4 Order 50.03 49.95
4 10/13/2021 9:31:03.0 AM Firm M S 50.03 100 4 Cancel 50.04 49.95
5 10/13/2021 9:31:03.2 AM Firm M S 50.01 150 5 Order 50.01 49.95
6 10/13/2021 9:31:03.3 AM Firm M S 50.00 300 6 Order 50.00 49.95
7 10/13/2021 9:31:04.5 AM Firm M S 50.03 500 7 Order 50.00 49.95
8 10/13/2021 9:31:04.6 AM Firm M S 50.04 200 8 Order 50.00 49.95
9 10/13/2021 9:31:05.7 AM Firm M S 49.96 200 9 Order 49.96 49.95
10 10/13/2021 9:31:05.7 AM Firm M S 50.04 100 8 Cancel 49.96 49.95
11 10/13/2021 9:31:05.5 AM Firm M S 49.97 600 10 Order 49.96 49.95
12 10/13/2021 9:31:05.6 AM Firm M S 49.96 300 11 Order 49.96 49.95
13 10/13/2021 9:31:06.8 AM Firm S S 49.96 1000 12 Order 49.96 49.95
14 10/13/2021 9:31:06.9 AM Firm M B 49.96 1000 13 Order 49.96 49.96
15 10/13/2021 9:31:06.9 AM Firm M X 49.96 1000 12_13 Trade 49.96 49.95
16 10/13/2021 9:31:07.0 AM Firm M S 49.96 300 11 Cancel 49.97 49.95
17 10/13/2021 9:31:07.0 AM Firm M S 49.97 600 10 Cancel 50.00 49.95
#The Time column looks like '10/13/2021 9:31:00.0 AM'; join the date and time
#with a 'T' and drop the AM so pandas can parse it
def string_to_timestamp(string):
    timestamp_components = string.split(' ')
    timestamp_string = 'T'.join(timestamp_components[:2])
    return pd.to_datetime(timestamp_string)
manip_df['timestamp'] = manip_df['Time'].apply(string_to_timestamp)
import plotly.express as px
import plotly.graph_objects as go
fig = px.scatter(manip_df,
                 x='timestamp',
                 y='Price',
                 symbol='Action',
                 color='Firm',
                 size='Quantity',
                 color_discrete_sequence=['goldenrod','aqua','lightgreen'],
                 opacity=.5)

fig.add_trace(go.Scatter(x=manip_df['timestamp'],
                         y=manip_df['NBO'],
                         mode='lines',
                         line_color='magenta',
                         line_shape='hv',
                         name='Best Offer'))

fig.add_trace(go.Scatter(x=manip_df['timestamp'],
                         y=manip_df['NBB'],
                         mode='lines',
                         line_color='cyan',
                         line_shape='hv',
                         name='Best Bid'))
fig.update_layout(template='plotly_dark');
fig.write_html("manip_example.html")
fig.show()

In this example Firm S is selling the stock and Firm B is buying. The ‘spread,’ or difference between best bid and best offer, is 10 cents. The midpoint price, or what is often referred to as ‘the price’ if buyers and sellers met in the middle, is $50.

A Firm M would like to buy this stock, but they would like to buy closer to the bid than the ask, or even closer to the bid than the midpoint. What Firm M does is enter a bunch of sell orders to make it look like a lot of people suddenly want to sell this stock. Keep in mind that Firm S and Firm B cannot see who is putting in the orders, just that there are suddenly new orders at several different price points.

The orders are placed so that the best offer becomes progressively lower. Firm S, noting the sell pressure, panics that they may get stuck with an even lower price if they wait around with their original order at $50.05, and so they put in an offer of $49.96, the current best offer. Now the spread is just a penny, but Firm M won’t go any lower, because their sell order would trade with Firm B, and remember they don’t actually want to sell the stock.

When Firm M sees a sell order at a favorable price appear, they immediately put in a buy order and trade with Firm S, and then cancel all their other orders. They may even place some cancels beforehand to make the whole thing look more believable. Ultimately, Firm M will pull out quickly since they do not actually want to trade with anyone buying stock at their artificially low prices. The more believable they make their ‘non-bona fide’ quotes, the more real risk they take on. At some point, if their quotes could not be distinguished from normal activity, they would probably be trading against buyers looking for good prices and end up acting like a real trader in spite of themselves.

In this example Firm M got a 9 cent improvement on their buy order of 1000 shares, netting $90, which doesn’t sound like much, especially considering they put nearly $50,000 on the line to do it. However, for one thing, they can turn around and reverse the whole scheme, netting $90 the other way, and keep going back and forth. For another, Firm M accomplished this in 7 seconds. Though this is fabricated data, 7 seconds is actually a pretty long time in the stock market, so it is a realistic time scale for a manipulation like this. The wage for this scheme is then $46,285.71 per hour.
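For anyone checking that wage figure, the arithmetic is short:

#$0.09 improvement on 1,000 shares, once per 7-second cycle
profit_per_round = 0.09 * 1000        #$90
hourly = profit_per_round * 3600 / 7  #~$46,285.71 per hour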

The less realistic aspect of the data is the large spread of 10 cents, though it’s not unreasonable for a lightly traded stock. A lightly traded stock would also have a smaller number of firms trading, allowing Firm M to dominate the market more easily. Firm M would probably have to jump around different symbols since traders generally notice when their fills aren’t going so hot and may stop treating Firm M’s orders as believable, even though they wouldn’t be able to prove that all the orders were from the same party.

In short, if Firm M really did try to repeat this back and forth constantly all day, someone would notice, and may report them to FINRA. However, you can see that someone can make very good money by cheating. I hear traders who don’t cheat also make very good money, so I’d recommend the honest strategy.

As a final plug for honesty, just imagine if absolutely everyone were trading purely with a ‘zero-sum’ game mentality, where the point is to take as much money as possible from everyone else’s pile. If there is ‘a point’ to the capital markets, it is that companies that have a higher chance of succeeding, or that would benefit the most from a capital infusion, get money to grow their business. If no one is thinking about the value of owning a stock, collecting dividends, voting with shareholders, collecting premiums from those who would like to borrow the stock in short and options trades, then any rationality that may exist in the market is gone. Maybe I’m holding on to too much belief in markets as price discovery engines, but without fair markets where traders can basically trust the intentions of the other participants, our 401(k)s are just funding a money game that produces little to no value.

The Stardew Valley Grange Display - Application

Now that I’ve got an ‘ideal’ display made up, and I made lofty claims that I had a lot of it on hand, let’s see the reality. I want to look at my save file to see which 8 point items I have lying around and which I need to actually acquire. In my current game, I just got the letter from the mayor saying the fair is in one week, so if I bother to figure this out now, I’ve got 125 points in the bag.

I’ll be using lxml’s etree (element tree) to navigate the xml of the save file. I’ve put variables related to my save file’s location into a notebook. I’m not too concerned about you all knowing my Farm’s name and id, but it’s a nice trick for other times when I might want to avoid showing various paths/names/etc from my computer.

%run MyVariables.ipynb
You now have access to the variables farm_name, and farm_path
#import necessary packages
from IPython.display import display, HTML
import pandas as pd
from lxml import etree as ET
#open and parse file
with open(farm_path) as file:
    tree = ET.parse(file)
root = tree.getroot()
#This is an example of the xml containing information about an item
print(ET.tostring(root.find('player').find('items').findall('Item')[12],pretty_print=True).decode())
<Item xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:type="Object">
  <isLostItem>false</isLostItem>
  <category>-16</category>
  <hasBeenInInventory>true</hasBeenInInventory>
  <name>Clay</name>
  <parentSheetIndex>330</parentSheetIndex>
  <specialItem>false</specialItem>
  <SpecialVariable>0</SpecialVariable>
  <DisplayName>Clay</DisplayName>
  <Name>Clay</Name>
  <Stack>1</Stack>
  <tileLocation>
    <X>0</X>
    <Y>0</Y>
  </tileLocation>
  <owner>0</owner>
  <type>Basic</type>
  <canBeSetDown>true</canBeSetDown>
  <canBeGrabbed>true</canBeGrabbed>
  <isHoedirt>false</isHoedirt>
  <isSpawnedObject>false</isSpawnedObject>
  <questItem>false</questItem>
  <questId>0</questId>
  <isOn>true</isOn>
  <fragility>0</fragility>
  <price>20</price>
  <edibility>-300</edibility>
  <stack>1</stack>
  <quality>0</quality>
  <bigCraftable>false</bigCraftable>
  <setOutdoors>false</setOutdoors>
  <setIndoors>false</setIndoors>
  <readyForHarvest>false</readyForHarvest>
  <showNextIndex>false</showNextIndex>
  <flipped>true</flipped>
  <hasBeenPickedUpByFarmer>true</hasBeenPickedUpByFarmer>
  <isRecipe>false</isRecipe>
  <isLamp>false</isLamp>
  <minutesUntilReady>0</minutesUntilReady>
  <boundingBox>
    <X>0</X>
    <Y>0</Y>
    <Width>64</Width>
    <Height>64</Height>
    <Location>
      <X>0</X>
      <Y>0</Y>
    </Location>
    <Size>
      <X>64</X>
      <Y>64</Y>
    </Size>
  </boundingBox>
  <scale>
    <X>0</X>
    <Y>0</Y>
  </scale>
  <uses>0</uses>
  <preservedParentSheetIndex>0</preservedParentSheetIndex>
  <destroyOvernight>false</destroyOvernight>
</Item>
#helper function to flatten lists of xml elements
def unnest_lists(list_of_lists):
    from functools import reduce
    return reduce(lambda x,y: x+y, list_of_lists,[])

#recursively finds all tags in the tags list in the order they appear, returns a list of elements
def find_tags(parents,tag_list):
    if type(parents) is not list:
        parents = [parents]
    #base case
    if len(tag_list) == 0:
        return [[child for child in parent] for parent in parents]
    #recursive case
    else:
        tag = tag_list.pop(0)
        elements = unnest_lists([element.findall(tag) for element in parents])
        return find_tags(elements,tag_list)

#reads the text within the tags listed in tags, returns None if there is no text within a tag
def read_tags(element_list,tags):
    record = []
    for element in element_list:
        for tag in tags:
            if element.tag == tag:
                text = element.text
                record.append(text)
    if len(record)<len(tags):
        diff = len(tags)-len(record)
        record += diff*[None]
    return record

#displays the tables the way I'd like
def display_table(table_df):
    return display(HTML(table_df.to_html(escape=False,index=False)))

By general exploration, I found three relevant locations for the items I’m looking for: my active inventory, chests and mini-fridges, and my main fridge. The large fridge that comes with the house is apparently separate from the other chests. I also found that I could not use the object attribute ‘chest’ to identify relevant chests since this is an attribute for the shipping crate and mini-shipping crate.

#All the objects in all the game locations (Farm, Bus Stop, Ginger Island, etc)
Objects = find_tags(root,['locations','GameLocation','objects','item','value','Object'])
#Names of all the objects
Object_Names = unnest_lists([[element.text for element in Object if element.tag == 'DisplayName'] for Object in Objects])
#Narrow in on container objects
containers = ['Chest','Mini-Fridge','Junimo Chest','Stone Chest']
isContainerObject = [container in containers for container in Object_Names]
Container_Objects = unnest_lists([[Object for Object in Objects[i] if Object.tag=='items'] for i in range(len(Objects)) if isContainerObject[i]])
element_lists = find_tags(Container_Objects,['Item'])
#All the items in all your chests, stone chests, mini-fridges, and junimo chests
items_in_chests = [read_tags(element_list,['Name','price','stack','quality']) for element_list in element_lists]

#These are simpler because items falling under the <player><items>...</items></player> 
#pattern don't include a bunch of items like 'tables' and 'tv'. The same is true for the fridge items

#All the items you're holding right now
active_inventory = find_tags(root,['player','items'])
items_in_inventory = [[read_tags(item,['Name','price','stack','quality']) for item in items] for items in active_inventory]

#Turns out your fridge is its own thing and did not make the cut in chest
fridge = find_tags(root,['locations','GameLocation','fridge','items'])
items_in_fridge = [[read_tags(item,['Name','price','stack','quality']) for item in items] for items in fridge]
#Create three dataframes from our list of items
in_chests = pd.DataFrame.from_records(items_in_chests,columns = ['Name','Price','Stack','Quality'])
on_hand = pd.DataFrame.from_records(items_in_inventory[0],columns = ['Name','Price','Stack','Quality'])
in_fridge = pd.DataFrame.from_records(items_in_fridge[0],columns = ['Name','Price','Stack','Quality'])

#Concatenate the dataframes along the row axis, drop empty rows, and set the data types for each column
#Additionally, I'd like the 'stack' column to add together the total amount of each item I have in all chests
all_items = (pd.concat([in_chests,on_hand,in_fridge],axis = 0)
             .dropna()#drop rows with missing columns
             .astype({'Name':str,'Price':int,'Stack':int,'Quality':str})#force correct types on the columns
             .groupby(['Name','Price','Quality'])#groupby non-stack columns
             .sum()#sum the stacks
             .reset_index())#get the groupby columns back out of the index

display_table(all_items.loc[1:200:20,:])#Let's look at a sampling of the dataframe
Name Price Quality Stack
Algae Soup 100 0 2
Blue Jazz 50 1 2
Clay 20 0 525
Crayfish 75 0 4
Fall Seeds 45 0 30
Green Algae 15 0 157
Lucky Lunch 250 0 4
Oak Resin 150 0 47
Pumpkin Pie 385 0 1
Rice Pudding 260 0 1
#I now need the dataframe I created earlier
%run Grange_Display.ipynb
display_table(grange_df.head())
Item category Price Quality Points
Duck Egg Animal Products 95g Base 3
Duck Feather Animal Products 250g Base 4
Egg Animal Products 50g Base 2
Brown Egg Animal Products 50g Base 2
Goat Milk Animal Products 225g Base 4

So now I have a dataframe with point values for various items, and a dataframe of items I actually have. I would like to join the dataframes to find a list of 8 point items that I actually have. Since the quality columns are formatted differently in the two dataframes, I’ll have to fix that.

#Note there are some differences in how the columns are labeled, 
#most importantly quality is numeric in my save file but a string on the wiki
quality_dict = {'0':'Base','1':'Silver','2':'Gold','4':'Iridium'}#map between the two formats
all_items['Quality_str'] = all_items['Quality'].apply(lambda x: quality_dict[x])#relabel my save file quality

#Create a join between the two dataframes, drop columns from join that would be duplicated
my_grange_items = grange_df.merge(all_items.drop(['Price','Quality'],axis=1),
                                  left_on=['Item','Quality'],
                                  right_on=['Name','Quality_str'],
                                  how='inner')#Do an inner join
#display 8 point items I have on hand
display_table(my_grange_items.loc[my_grange_items.Points == 8,['Item','category','Price','Stack']])
Item category Price Stack
Large Milk Animal Products 190g 6
Rabbit's Foot Animal Products 565g 3
Cheese Artisan Goods 230g 1
Flounder Fish 100g 1
Walleye Fish 105g 1
Beet Vegetables 100g 1

I can see that I’ve got 4/6 categories with 8 point items, and sufficient iridium quality Large Milk to pad out the display to 9 items. I’m missing forage and fruit. I earlier identified coconuts and certain mushrooms as an easy way to get these.

And sure enough, I threw in an iridium quality chanterelle and coconut and got 125 points on my grange display!

125 Point Grange Display

The Stardew Valley Grange Display - Analysis

According to the Stardew Valley wiki, the grange display’s score is calculated according to the formula:

14 + (# of items) + 5*(# of categories, capped at 6) + (sum of individual item scores)

So if you had 9 items in the display spanning at least 6 categories, and 3 items had a score of 5, 4 items had a score of 7, and 2 had a score of 8, the final score would be: 14 + 9 + 5*6 + (5+5+5+7+7+7+7+8+8) = 112

To get a perfect score, you need 9 items spanning at least 6 categories that are all 8 point items: 14 + 9 + 5*6 + (8+8+8+8+8+8+8+8+8) = 125
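As a sanity check on that arithmetic, the whole formula fits in a couple of lines of Python (my own convenience function, not something from the wiki):

#score = 14 + item count + 5*(categories, capped at 6) + summed item scores
def grange_score(item_scores, n_categories):
    return 14 + len(item_scores) + 5*min(n_categories, 6) + sum(item_scores)

print(grange_score([5,5,5,7,7,7,7,8,8], 6))#112
print(grange_score([8]*9, 6))#125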

So we really need to find items that are worth 8 points. Some items are ineligible and get 0 points; they are marked on the wiki table with a dash. The table below shows the max score available in each category. Notice that not every category has an 8 point item! Since there are still 6 categories with 8 point items, we just won’t use ‘Cooking’ or ‘Minerals’.

#Find categories with no 8 point entries
grange_df['Points'] = grange_df['Points'].apply(lambda x: 0 if x == '—' else int(x))
max_in_category = grange_df.groupby('category').Points.max()
display(max_in_category)
category
Animal Products                     8
Artisan Goods                       8
Cooking                             6
Fish                                8
Foraging, Flowers, and Tree Saps    8
Fruits                              8
Minerals                            6
Vegetables                          8
Name: Points, dtype: int64

It’s also important to note that not every item in the game is actually in the wiki table. For items outside the tables, there is a lookup table using sell price and quality to determine points. Here, I’m using the ‘tables’ object from the previous post.

import re
def strip_nonnumeric(string):
    return re.sub("[^0-9]", "", string)

price_df = parse_table(tables[11],get_heading=False)
price_df['Low Price'] = price_df['Sell Price'].apply(strip_nonnumeric)
price_df['High Price'] = price_df['Low Price'].shift(-1)
price_df = price_df.drop('Sell Price',axis = 1)
price_df.columns = ['Base','Silver','Gold','Iridium','Low Price','High Price']
display_table(price_df.loc[:,['Low Price','High Price','Base','Silver','Gold','Iridium']].fillna('Inf'))
Low Price High Price Base Silver Gold Iridium
0 20 1 2 3 5
20 90 2 3 4 6
90 200 3 4 5 7
200 300 4 5 6 8
300 400 5 6 6 8
400 Inf 6 6 6 8

Professions

There are several profession bonuses available in the game, and most grange display calculators take these into account, but I’m narrowing my focus to items that lead to maximum scores. So I bet I can toss a bunch of these out by finding profession bonuses that will never result in an 8 point item. To do this, refer back to the points lookup table above.

Spring Onion Mastery

Spring Onions have a sell price of 8g; iridium quality doubles this to 16g, and with Spring Onion Mastery it becomes 16g x 5 = 80g. That’s still only 6 points, so this won’t help us push anything up to 8.

Bear Knowledge

I love the Bear, but Bear Knowledge is similarly pumping up pretty low priced items. Salmonberries are even cheaper than spring onions (5g), and Bear Knowledge only gives a x3, so those will never be an 8. Blackberries are 20g, iridium quality is 40g, and so with Bear Knowledge they sell for 120g, which is still only 7 points.

Blacksmith, Gemologist, and Tapper

These are bonuses on items that can never be iridium quality (bars, minerals, and gems), so these can never be 8 points (see table).

Tiller, Artisan, Rancher, Fisher, and Angler

These all require some actual calculation. We’re looking for 8 point items, and so we can narrow our focus to iridium quality.

  • Tiller will apply to vegetables, non-foraged fruits, and flowers.
  • Artisan will apply to artisan goods
  • Rancher will apply to animal products
  • Fisher and Angler will apply to Fish

Though some items within a category will be unaffected by these bonuses (e.g. flowers are the only items in the Foraging, Flowers, and Tree Saps category affected by Tiller), there are no cross-category bonuses, and only one category gets two (Fish).

Inspecting the items that are in the table, none of the profession bonuses boosts an item up to 8 points unless it was already an 8.

def assign_points(price):
    if price >= 200:
        return 8
    elif price >= 90:
        return 7
    elif price >= 20:
        return 6
    elif price >= 0:
        return 5
    else:
        return '—'
    
bonus_dict = {'Animal Products':1.2,
              'Artisan Goods':1.4,
              'Cooking':1,
              'Fish (Fishing)':1.25,
              'Fish (Angler)':1.5,
              'Foraging, Flowers, and Tree Saps':1.1,
              'Fruits':1.1,
              'Minerals':1,
              'Vegetables':1.1}
    
df = grange_df.loc[(grange_df['Quality']=='Iridium')&(grange_df['Points']!='—'),:]
fishing_df = df.loc[df['category']=='Fish',:].copy()
fishing_df['category'] = 'Fish (Fishing)'
angler_df = df.loc[df['category']=='Fish',:].copy()
angler_df['category'] = 'Fish (Angler)'
df = pd.concat([df.loc[df['category']!='Fish',:].copy(),fishing_df,angler_df], axis = 0)
#df['Points'] = df['Points'].apply(lambda x: 0 if x == '—' else int(x))
df['Price'] = df['Price'].apply(strip_nonnumeric).astype(int)
df['Price w/ Bonus'] = (df['Price']*df['category'].apply(lambda x: bonus_dict[x])).astype(int)
df['Points w/ Bonus'] = df['Price w/ Bonus'].apply(assign_points)

boost_filter = (df['Points'].astype(int) < 8)&(df['Points w/ Bonus'] == 8)
print("Items that were less than 8 points and were boosted to 8 by profession bonuses:")
display_table(df.loc[boost_filter,:])

Items that were less than 8 points and were boosted to 8 by profession bonuses:

Item category Price Quality Points Price w/ Bonus Points w/ Bonus

In other words, none. Perhaps this is no longer true if I include items outside the wiki tables, but that’s for the next post. And if you want to calculate point values for an arbitrary display, the bonuses should obviously be included. Now I’d like to look at the cheapest, easiest way to get 8 point items in each category. For now, I’ll try looking at the items with the lowest sell price in each category.

high_score_df = grange_df.loc[(grange_df['Quality']=='Iridium')&(grange_df['Points']=='8'),:].copy()
high_score_df['Price'] = high_score_df.Price.apply(strip_nonnumeric).astype(int)
summary_df = high_score_df.groupby('category').agg({'Item':'count','Price':min}).rename(columns={'Item':'Total Items','Price':'Minimum Sell Price'}).reset_index()
min_priced_items_df = (high_score_df
                       .merge(summary_df.loc[:,['category','Minimum Sell Price']], on = ['category']))
display_table(min_priced_items_df.loc[min_priced_items_df['Price'] == min_priced_items_df['Minimum Sell Price'],:])
Item category Price Quality Points Minimum Sell Price
Milk Animal Products 125 Iridium 8 125
Wine (Hot Pepper) Artisan Goods 120 Iridium 8 120
Dorado Fish 100 Iridium 8 100
Flounder Fish 100 Iridium 8 100
Largemouth Bass Fish 100 Iridium 8 100
Lionfish Fish 100 Iridium 8 100
Midnight Squid Fish 100 Iridium 8 100
Pike Fish 100 Iridium 8 100
Slimejack Fish 100 Iridium 8 100
Tuna Fish 100 Iridium 8 100
Snow Yam Foraging, Flowers, and Tree Saps 100 Iridium 8 100
Apple Fruits 100 Iridium 8 100
Coconut Fruits 100 Iridium 8 100
Orange Fruits 100 Iridium 8 100
Beet Vegetables 100 Iridium 8 100
Taro Root Vegetables 100 Iridium 8 100
  • Animal Products: Milk
  • Artisan Goods: Hot Pepper Wine
  • Fish: Dorado, Flounder, Largemouth Bass, Lionfish, Midnight Squid, Pike, Slimejack, Tuna
  • Foraging, Flowers, and Tree Saps: Snow Yam
  • Fruits: Apple, Coconut, Orange
  • Vegetable: Beet, Taro Root

A lot of these are actually pretty easy to get. Iridium quality milk you can get by the pallet even in Year 1 with a cow at full hearts. Dorado, Flounder, and Lionfish are pretty easy fish to get a perfect catch, though you need Ginger Island unlocked to get the Lionfish. Iridium Coconuts can just be picked up off the ground in the Calico Desert provided you have the Botanist profession (and you really should).

A few of these, though having a low sell price, do not meet my requirement of ‘easy’. Hot Pepper Wine requires a fully upgraded house and a lot of time (or fairy dust) to get to iridium quality. Snow Yams do not benefit from Botanist since you dig them from artifact spots, therefore I believe the only way to get an iridium Snow Yam is to use deluxe fertilizer and winter seeds (and that’s two RNGs I don’t need). Beets and Taro Root would both require deluxe fertilizer; Taro Root has the advantage of not requiring watering, though by the time you have deluxe fertilizer and Ginger Island unlocked, you should have iridium sprinklers coming out your ears. Beets grow faster, so I can roll the dice more on less deluxe fertilizer in the same amount of time. Since there is no other way to get an iridium vegetable, I’ll stick with the beets; the other two will need replacing. Let’s look at the fuller list in these two categories.

category_filter = high_score_df['category'].isin(['Foraging, Flowers, and Tree Saps','Artisan Goods'])
category_filter &= high_score_df['Item'].str[:4] != 'Wine'
display_table(high_score_df.loc[category_filter,:].sort_values(['category','Price']))
Item category Price Quality Points
Mayonnaise Artisan Goods 190 Iridium 8
Beer Artisan Goods 200 Iridium 8
Cheese Artisan Goods 230 Iridium 8
Pale Ale Artisan Goods 300 Iridium 8
Goat Cheese Artisan Goods 400 Iridium 8
Snow Yam Foraging, Flowers, and Tree Saps 100 Iridium 8
Poppy Foraging, Flowers, and Tree Saps 140 Iridium 8
Morel Foraging, Flowers, and Tree Saps 150 Iridium 8
Chanterelle Foraging, Flowers, and Tree Saps 160 Iridium 8
Purple Mushroom Foraging, Flowers, and Tree Saps 250 Iridium 8
Fairy Rose Foraging, Flowers, and Tree Saps 290 Iridium 8
Magma Cap Foraging, Flowers, and Tree Saps 400 Iridium 8

Beer and Pale Ale have the same issue as wine. You can get gold quality cheese pretty easily and age it to iridium in about a week, so this is probably the way to go. If you have Ginger Island unlocked and really enjoy doing Professor Snail’s bidding, iridium quality ostrich eggs are a way to get iridium quality mayo without the fully upgraded house.

Poppies and Fairy Roses would also require deluxe fertilizer, and I’d like to minimize that, as Qi coins don’t grow on trees. It’s hardly the scarcest thing in the game though, and I just got one as a drop in the dangerous mines. The mushrooms benefit from Botanist, and Morels/Chanterelles are available in the Secret Woods. Magma Caps can be picked up in the volcano and Purple Mushrooms are available in Professor Snail’s mushroom cave.

So that leaves me with a fairly low-stress 125 point grange display:

  • 4 Large Milk
  • 1 Lionfish
  • 1 Coconut
  • 1 Cheese
  • 1 Morel Mushroom
  • 1 Beet

All Iridium quality. 14 pts for being me + 30 pts for 6 categories + 9 pts for 9 items + 9x8 pts for items. In my current playthrough, I have not unlocked the walnut room, so I will be stuck at 124 until I get that iridium beet. The rest of these items I have lying around in chests/fridges in Winter 2. My dream of a perfect Grange should be fulfilled in Year 3. I think someone who really made it a priority could get to this point by Year 2, but Year 1 is probably out of the question. Some people are very good at this game though, so I don’t want to say impossible.