Reservoir simulation involves sparse iterative solvers for the linear systems that arise from implicit discretizations of coupled PDEs in high-fidelity reservoir simulators. One of the major bottlenecks in these solvers is the sparse matrix-vector product (SpMV). Sparse matrices are usually stored in a compressed format (e.g., CSR, ELL) before being processed. In this talk, we focus on the low-level design of an SpMV kernel on GPUs. Most related contributions introduce new formats that suit the GPU architecture, such as the diagonal format for diagonal matrices and the blocked-ELL format for sparse matrices with small dense blocks. In contrast, we target both generic and domain-specific implementations. The generic implementations target the CSR and ELL formats, so that they can become part of the KAUST-BLAS library. Further optimization opportunities arise when the matrix has a specific structure. We will present the major design challenges, an outline of the design, and preliminary results, with the primary focus on the CSR format. The other bottleneck of reservoir simulations is preconditioning in the sparse solver. We investigate the possibility of a Fast Multipole Method based technique on GPUs as a compute-bound preconditioner.
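
For readers unfamiliar with the CSR format mentioned above, the following is a minimal CPU reference sketch of the CSR matrix-vector product, not the GPU kernel presented in the talk. The function name `csr_spmv` and the array names `row_ptr`, `col_idx`, and `values` are illustrative assumptions and are not taken from KAUST-BLAS.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] delimits the nonzeros of row i,
    col_idx[k] is the column of the k-th nonzero, values[k] its value.
    All names here are hypothetical, chosen for illustration only.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Walk the contiguous slice of nonzeros belonging to this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Small 3x3 example matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)
```

Each row's dot product is independent of the others, so the outer loop is the natural unit of parallelism on a GPU, for example one thread or one group of threads per row. That mapping is a general observation about CSR and should not be read as the specific design discussed in the talk.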