All things about Optimization

There are many factors to consider when building a new warehouse facility. Making the right decisions with regards to both warehousing and distribution methods could make all the difference for your firm. This can range from storage requirements, the level of automation in material handling, and workforce availability. The most important among all is the Physical Location.

Physical Location

When zeroing on which warehouse to use, choosing the one with the best physical location is important. The first question to ask yourself is, which region are you looking to serve? Ensuring your product is stored in a region near your customers is important for prompt deliveries. This also factors into considering cost. Calculating landed transportation costs to facility from manufacturing, and expected transportation costs from facility to end customer, will help decide where you can afford to keep your product.

The location’s proximity to carrier facilities should also be taken into consideration. Look for an all-encompassing solution that offers both warehousing and transportation to get the most value out of your investment, or ensure your storage facility is as close to your carrier as possible.

The general objective of transport optimization is minimizing transport cost subjected to the customer service policy. Normally, the transport solution affects inventory carrying and warehousing cost considerably.

Consider a simple scenario – A company stores all of its products in a single warehouse. All of its customers are located in the eastern sea board of the US. The management wishes to determine the best location for their warehouse in order to minimize total transportation costs. The number of shipments made to each of its customers for a period is given. The main objective is to determine the warehouse location based on its latitude and longitude coordinates.

Optimize Warehouse Location in Excel

The input for this problem is the location of each address identified by its lat-long. The user is asked to fill in the shipping addresses of the customer and the number of shipments in a given period, and the spreadsheet works tirelessly behind the scenes to determine the optimal lat-long coordinates of the warehouse to minimize transportation costs. Once you have filled out the shipping point addresses, simply click on “Geocode” command button. Columns C, D, and E are computed using Google Geo web service.

The distance formula is based on spherical law of cosines and bears more accuracy than the straightforward orthogonal distance. R is 69 and is based on earth’s curvature and is used in calculating latitude and longitude distances for cities.

d = acos(sin θ₁ * sin θ₂ + cos θ₁* cos θ₂ * cos Δλ ) * R

Geo-coding and distance computation are done behind the scenes by a free web service from Google. A Caveat – Google is nice enough to geo-code for free but they won’t put up with thousands of requests or attempts per hour from the same IP. After about 10 seconds or so, they will stop sending coordinates for a brief period. You may occasionally get “Not Found” message. The solution is to wait for few seconds and try again. The underlying code has an intentional delay of 2 seconds for every address that needs to be geo-coded. If you persistently get the “Not Found” message check to make sure that your address is as complete as it can be. Adding things like the province and the country can help Google narrow its search and provide a better result.

Setting up Solver

The decision variables are the latitude and longitude values of the warehouse location in columns H3 and I3 respectively.

Latitude and Longitude coordinates are the only constraint to this problem and they must be between 0 and 120 degrees. We only need to add the constraint that they be less than or equal to 120 and they must be non-negative.

The objective function to minimize the total distance between the warehouse and the shipment addresses is set to cell J12. It is the sum of the array product between the column of distances and the column of shipments made to each shipping addresses from the input table. The column range for shipments is “WhouseShipments” and distance as “WHouseDistance”. Thus, the formula for the objective function becomes:

=SUMPRODUCT(WhouseShipments,WHouseDistance)

The decision variables, cells H3 and I3 are to be determined by the solver. Cells H3 and I3 must be non-negative. Choose “Evolutionary” as the solving method.

Now click on Options to view the Evolutionary Solver options. Check Assume Non-Negative and set the population size and mutation rate. Initial population size in this problem is 100 and mutation rate as 25%. Once the solver finishes its operation, you can view the number of iterations of the Genetic Algorithm in the Population and Results report. The outcome of the solution is shown below in the Population report.

Based on those 24 customer shipping points in this example, the solver by evolutionary algorithm has found the coordinates of (41.74, 76.25) as optimal.

Download Optimize Warehouse Location Spreadsheet

#excel-google-geo-xml, #linear-programming, #lp-simplex, #transport-optimization

Revenue prediction and forecasts are typically done to be too right. They create a false sense of precision that ultimately falls short and does a disservice to managers who need accurate forecasts for planning. Hey, even meteorologists equipped with voluminous historical data and supercomputers cannot get it right all the time!

Why Monte Carlo Simulation?

Monte Carlo simulations can build valid probability models that give us much more accurate view into the future. There could be thousands of potential variables involved in monthly sales, and we cannot manage them all. MC simulation can help emulate real world combinations of few variables. Remember that MC simulation is not a magic bullet to create forecasts. For products that have seasonality and promotions, I typically use ARIMA, Stepwise Regression and Time Series techniques.

Revenue Prediction by Monte Carlo and Triangular Distribution

This simple example is driven by stochastic inputs. The stochastic input includes the units sold and sales price for a product without history. Since these values are non-deterministic, the user is asked to guess the best, worst, and most likely (mode) cases. A triangular random variable traverses the values that may occur between the best case and worst case values. For this revenue prediction application, we want to generate a random number from the triangular distribution for a given probability of this input. Hence, the triangular distribution is modeled with the following parameters:

a (worst case value)

b (best case value)

m (most likely value) or the mode; (a ≤ m ≤ b)

Both a and b serve as the boundary parameters (a < b)

Probability Density Function PDF can be defined as,

The inverse triangular distribution function is defined as,

We basically use this PDF to create values for the stochastic input such as units sold during the simulation runs. In every single run, these values are updated and recorded on a simulation calculation.

The spreadsheet in this post uses the above standard inverse cumulative method for generating a random number. Note that the Excel Rnd function is a pseudo random number generator and is not robust. You can use alternative uniform random number generator from tools such as Matlab or R.

The application is simple to use. Although few features require manual setup. You can begin by keying in best, worst and most likely (mode) of units sold and sale prices in columns C5 through E6. Enter total trials or simulation runs in column C10. I’ve simulated the revenue model using 5000 runs.

Next, start the simulation by clicking on the button. Higher number of runs makes your data set more realistic but it comes at the expense of computing speed and processor limitations. Cells B13 through E5012 are output of the simulation. The TraingleDist() takes as input the best case, worst case, and most likely values from the above input and outputs a random number that is found from this triangular distribution for a given probability.

The results are summary of basic stats from the simulation runs – Mean, Min, Max, and Sample Std. Deviation. They are in Cells K4 through K7.

Few Manual things to do:

So far, the VBA did the simulation from triangular distribution automatically. The histogram part requires few simple manual steps. If you plan on running the simulation for 5000 trials, you can ignore this step. There are 16 bins that represent the revenue data intervals. For number of trials other than 5000, simply change the following formula in cell K13.

=COUNTIFS($E$13:$E$5012,”>=”&J13,$E$13:$E$5012,”<“&J14)

Say you ran the simulation for 2000 runs, change the formula to

=COUNTIFS($E$13:$E$2012,”>=”&J13,$E$13:$E$2012,”<“&J14)

and drag it through K28.

The histogram plot is driven by the frequency distribution table above.

As you can notice that the results cluster around the mean of 24K. 5000 simulations may be unrealistic or not enough to get a good sample set. The limitations are mainly due to memory, and processor speed. Careful planning must be take into account for the long tail events. Questions to be addressed are – how to deal with just 16K revenue or how can a business support unexpected growth of 36K in revenue?