Understanding and characterizing the performance problems of CPU-GPU programs, and providing insightful feedback that guides programmers in tuning their applications, are critical to improving developer productivity. HPCToolkit is a state-of-the-art performance analysis tool that employs statistical sampling of timers and hardware counters and attributes performance metrics to the hierarchical calling context. We extend HPCToolkit to measure and attribute the performance of hybrid CPU-GPU codes. We present CPU-GPU blame shifting, a technique to identify code regions that underutilize CPU and/or GPU compute resources. We demonstrate the effectiveness of our tools on diverse scientific codes, including hydrodynamics (LULESH), molecular dynamics (LAMMPS), and epidemiology simulation (GPU-EpiSimdemics).