Once you have trained a neural network to perform some unique and interesting task, you might wonder how to make it available to colleagues, collaborators, or perhaps the world. One of the best ways to do that is to wrap it in a REST-based microservice, so that anyone with the URL can send a request and get an answer from your network. We'll show how three technologies come together to make that possible: (1) TensorRT provides low-latency, high-throughput inference; (2) custom layer support in TensorRT lets you express your unique deep learning secret sauce within TensorRT; and (3) GPU REST Engine gives you a fast and easy way to create a GPU-powered microservice. We'll walk through the steps you need to start creating your own deep learning-powered microservices.