Background: Valid and reliable outcome measures are needed to determine and compare treatment results of port wine stain (PWS) studies. In addition, uniformity in outcome measures is crucial to enable inter-study comparisons and meta-analyses. This study aimed to assess the heterogeneity in reported PWS outcome measures by mapping the (clinical) outcome measures currently used in prospective PWS studies.

Methods: OVID MEDLINE, OVID Embase, and CENTRAL were searched for prospective PWS studies published from 2005 to May 2020. Interventional studies with a clinical efficacy assessment were included. Two reviewers independently evaluated methodological quality using a modified Downs and Black checklist.

Results: In total, 85 studies comprising 3,310 patients were included, in which 94 clinician/observer-reported clinical efficacy assessments had been performed using 46 different scoring systems. Eighty-one studies employed a global assessment of PWS appearance/improvement, of which 82% were expressed as percentage improvement and categorized in 26 different scoring systems. A wide variety of other global and multi-item scoring systems was identified. As a result of outcome heterogeneity and insufficient data reporting, only 44% of studies could be directly compared. A minority of studies included patient-reported or objective outcomes. Thirteen studies were of good quality.

Conclusion: Clinical PWS outcomes are highly heterogeneous, which hampers study comparisons and meta-analyses. Consensus-based development of a core outcome set would benefit future research and clinical practice, especially considering the lack of high-quality trials.