Walmart Labs has been charged with building the next-generation store forecasting system, replacing the current system from JDA. Due to the size of the forecasting problem (52 weekly forecasts for roughly 500 million store-item combinations, generated every week), we realized early on that we would have to use GPU computing if we wanted to move beyond simple forecasting approaches such as exponential smoothing. We have taken a multi-pronged approach to improving forecast accuracy while remaining within execution time windows: using NVIDIA-supplied software such as XGBoost for forecasting, developing custom algorithms (some in CUDA) for various forecasting and forecasting-related processes, and moving to a RAPIDS-based feature generation pipeline. At the moment, roughly 20% of our items are forecast by the new system, and we expect to reach 100% item coverage by the end of the year. In this talk we will outline our forecasting strategy from both an algorithmic and a computational perspective. We will show how GPU computing has enabled us to significantly improve forecast accuracy, and highlight the key bottlenecks that we have been able to overcome. We will provide runtime comparisons of CPU-based vs. GPU-based algorithms on our real-world problems, and describe how GPU-based development works for us (hint: it's easy to do). We will also describe our collaboration with NVIDIA, who have been extremely helpful in continuously refining their algorithms and tools to better meet the needs of industry, and discuss which tools and capabilities we see as especially useful for our path forward.