(https://github.com/ChromiumWebApps/chromium/blob/master/build/android/pylib/perf/surface_stats_collector.py)
这段时间在做 Android 的性能测试,发现这种获取 fps 的方法比较方便,对其实现方式很好奇,但是在网上搜了下,相关的资料较少,大多都没讲清楚或,so 记录下自己的分析过程,给大家个参考。
def Start(self):
assert not self._collector_thread
if self._ClearSurfaceFlingerLatencyData():
self._get_data_event = threading.Event()
self._stop_event = threading.Event()
self._data_queue = Queue.Queue()
self._collector_thread = threading.Thread(target=self._CollectorThread)
self._collector_thread.start()
else:
self._use_legacy_method = True
self._surface_before = self._GetSurfaceStatsLegacy()
首先从 Start 开始,先 assert 没有搜集线程在跑,在判断 self._ClearSurfaceFlingerLatencyData():,看名字是清除 surface 数据的,跟进去看下:
def _ClearSurfaceFlingerLatencyData(self):
"""Clears the SurfaceFlinger latency data.
Returns:
True if SurfaceFlinger latency is supported by the device, otherwise
False.
"""
# The command returns nothing if it is supported, otherwise returns many
# lines of result just like 'dumpsys SurfaceFlinger'.
results = self._adb.RunShellCommand(
'dumpsys SurfaceFlinger --latency-clear SurfaceView')
return not len(results)
执行了 adb shell dumpsys SurfaceFlinger --latency-clear SurfaceView 命令,如果该命令返回空,说明硬件支持 SurfaceFlinger latency,如果返回很多行结果(像 adb shell dumpsys SurfaceFlinger 一样),说明硬件不支持 SurfaceFlinger latency,好吧,我的手机返回的是支持 SurfaceFlinger latency,我们先看支持的情况,回到 Start 函数:
def Start(self):
assert not self._collector_thread
if self._ClearSurfaceFlingerLatencyData():
self._get_data_event = threading.Event() #获取线程event对象
self._stop_event = threading.Event() #再来一个线程event对象
self._data_queue = Queue.Queue() #申请个数据队列
self._collector_thread = threading.Thread(target=self._CollectorThread) #线程,指向self._CollectorThread函数
self._collector_thread.start() #线程启动
else:
self._use_legacy_method = True
self._surface_before = self._GetSurfaceStatsLegacy()
去看看线程在干什么,跟到 self._CollectorThread 函数:
def _CollectorThread(self):
last_timestamp = 0 #最后时间戳设置成0
timestamps = [] #初始化一个时间戳数组
retries = 0 #重试次数初始为0
while not self._stop_event.is_set(): #self._stop_event第一次用,is_set()返回false,所以就进去了。
self._get_data_event.wait(1) #阻塞线程1秒
try:
refresh_period, new_timestamps = self._GetSurfaceFlingerFrameData() #看下面分析,返回(刷新周期,[B列时间戳数组])
if refresh_period is None or timestamps is None: #如果出错,重试3次
retries += 1
if retries < 3:
continue
if last_timestamp:
# Some data has already been collected, but either the app
# was closed or there's no new data. Signal the main thread and
# wait.
self._data_queue.put((None, None))
self._stop_event.wait()
break
raise Exception('Unable to get surface flinger latency data')
timestamps += [timestamp for timestamp in new_timestamps #
if timestamp > last_timestamp] #把循环得来的new_timestamps 放入总的timestamps
if len(timestamps):
last_timestamp = timestamps[-1] #更新最后的时间戳
if self._get_data_event.is_set():
self._get_data_event.clear()
self._data_queue.put((refresh_period, timestamps)) #把得到的数据放入数据队列
timestamps = []
except Exception as e:
# On any error, before aborting, put the exception into _data_queue to
# prevent the main thread from waiting at _data_queue.get() infinitely.
self._data_queue.put(e)
raise
self._GetSurfaceFlingerFrameData() 函数分析:
def _GetSurfaceFlingerFrameData(self):
"""Returns collected SurfaceFlinger frame timing data.返回手机的SurfaceFlinger 帧数据
Returns:返回一个元组,(刷新周期,[时间戳数组])
A tuple containing:
- The display's nominal refresh period in seconds.
- A list of timestamps signifying frame presentation times in seconds.
The return value may be (None, None) if there was no data collected (for
example, if the app was closed before the collector thread has finished).
如果线程完成之前,app已经退出了,有可能返回None,None。
"""
#这段是介绍 dumpsys SurfaceFlinger --latency命令的
# adb shell dumpsys SurfaceFlinger --latency <window name>
# prints some information about the last 128 frames displayed in
# that window.
# The data returned looks like this:
# 16954612
# 7657467895508 7657482691352 7657493499756
# 7657484466553 7657499645964 7657511077881
# 7657500793457 7657516600576 7657527404785
# (...)
#
"""
# The first line is the refresh period (here 16.95 ms), it is followed 第一行是刷新周期,换算毫秒是16.95毫秒
# by 128 lines w/ 3 timestamps in nanosecond each: 后面跟着128行,每行3列数据
#第一列表示app开始绘制的时间
# A) when the app started to draw
#垂直同步软件把帧提交给硬件之前的瞬时时间戳,我理解是frame已经写入到共享内存的时间,后面就是交给硬件层去显示了
# B) the vsync immediately preceding SF submitting the frame to the h/w
#软件提交帧给硬件之后瞬时时间戳,这个应该是正在的显示在屏幕的时间吧
# C) timestamp immediately after SF submitted that frame to the h/w
#第一个时间戳和第三个时间戳之间的差别,就是帧延时
# The difference between the 1st and 3rd timestamp is the frame-latency.
#当帧延时超过一个刷新周期的边界,
# An interesting data is when the frame latency crosses a refresh period
# boundary, this can be calculated this way:
#C-A,除以刷新周期,如果C-A大于刷新周期,就是一次jank,掉帧?
# ceil((C - A) / refresh-period)
#每出现一次这种情况,我们有了一次jank
"""
# (each time the number above changes, we have a "jank").
# If this happens a lot during an animation, the animation appears
# janky, even if it runs at 60 fps in average.
#
# We use the special "SurfaceView" window name because the statistics for
# the activity's main window are not updated when the main web content is
# composited into a SurfaceView.
#下面运行下命令
results = self._adb.RunShellCommand(
'dumpsys SurfaceFlinger --latency SurfaceView',
log_result=logging.getLogger().isEnabledFor(logging.DEBUG))
#没有数据返回空
if not len(results):
return (None, None)
#定义个时间戳数组
timestamps = []
nanoseconds_per_second = 1e9 #纳秒转换成秒用的
refresh_period = long(results[0]) / nanoseconds_per_second #刷新周期,就是第一行的数据,转换成秒
#3d的东西不太懂,大概意思就是说我们查询某帧数据出问题的时候,系统会返回一个64位的int最大值,我们会忽略这列数据
# If a fence associated with a frame is still pending when we query the
# latency data, SurfaceFlinger gives the frame a timestamp of INT64_MAX.
# Since we only care about completed frames, we will ignore any timestamps
# with this value.
pending_fence_timestamp = (1 << 63) - 1
for line in results[1:]: #处理后面128行帧数据
fields = line.split()
if len(fields) != 3: #确认有3个数据待处理
continue
timestamp = long(fields[1]) #取中间的数据
if timestamp == pending_fence_timestamp:#忽略异常数据
continue
timestamp /= nanoseconds_per_second #这个时间转换成秒
timestamps.append(timestamp) #放入数据
return (refresh_period, timestamps) #(刷新周期,[时间戳数组])
start 函数暂时告一段落,总的来说初始化了一些数据,开启了一个数据收集线程。
下面来看 Stop 函数,就是收集结束时调用的。
def Stop(self):
self._StorePerfResults()
if self._collector_thread:
self._stop_event.set()
self._collector_thread.join()
self._collector_thread = None
调用了 self._StorePerfResults():
def _StorePerfResults(self):
if self._use_legacy_method: #如果用legacy方法,我们上面用的不是这种方法,先不看
surface_after = self._GetSurfaceStatsLegacy()
td = surface_after['timestamp'] - self._surface_before['timestamp']
seconds = td.seconds + td.microseconds / 1e6
frame_count = (surface_after['page_flip_count'] -
self._surface_before['page_flip_count'])
self._results.append(SurfaceStatsCollector.Result(
'avg_surface_fps', int(round(frame_count / seconds)), 'fps'))
return
# Non-legacy method.
assert self._collector_thread #确认收集数据的线程存在
(refresh_period, timestamps) = self._GetDataFromThread()# _GetDataFromThread取出线程收集的数据
if not refresh_period or not len(timestamps) >= 3:#数据异常的警告
if self._warn_about_empty_data:
logging.warning('Surface stat data is empty')
return
self._results.append(SurfaceStatsCollector.Result( #初始化一个数据结果对象
'refresh_period', refresh_period, 'seconds'))
self._results += self._CalculateResults(refresh_period, timestamps, '') #算结果
self._results += self._CalculateBuckets(refresh_period, timestamps)
看下数据结果对象的格式:
class Result(object):
def __init__(self, name, value, unit): #'refresh_period', refresh_period, 'seconds'
self.name = name
self.value = value
self.unit = unit
几个计算结果的静态方法
@staticmethod
def _GetNormalizedDeltas(data, refresh_period, min_normalized_delta=None):
deltas = [t2 - t1 for t1, t2 in zip(data, data[1:])]
if min_normalized_delta != None:
deltas = filter(lambda d: d / refresh_period >= min_normalized_delta,
deltas)
return (deltas, [delta / refresh_period for delta in deltas])
@staticmethod
def _CalculateResults(refresh_period, timestamps, result_suffix):
"""Returns a list of SurfaceStatsCollector.Result."""
frame_count = len(timestamps) #总共多少帧
seconds = timestamps[-1] - timestamps[0] #算时间戳查,得到总的时间
frame_lengths, normalized_frame_lengths = \
SurfaceStatsCollector._GetNormalizedDeltas(
timestamps, refresh_period, _MIN_NORMALIZED_FRAME_LENGTH)
"""
#下面这块不是很明白,开始说要用C-A/刷新周期,这里却只用的第二列B的数据
# 说下我对这块的理解,不知道对不对
# B列数据是每帧绘制完成提交到硬件的时间戳,单位是纳秒,从开机算起。
#_GetNormalizedDeltas返回的是:
# [B2-B1,B3-B2,B4-B3.....], [B2-B1/周期,B3-B2/周期,B4-B3/周期.....]对应代码变量:
#frame_lengths, normalized_frame_lengths,就是[每2帧之间的时间间隔]和[这个时间间隔/16.6667] 2个数组
#然后用[每2帧的时间间隔]数组和刷新周期,又调用了一次_GetNormalizedDeltas函数
#如果时间间隔差出现大于刷新周期的情况的话,就是一次jank,加起来就是jank_count,对应代码:
#sum(1 for change in jankiness
# if change > 0 and change < pause_threshold)
#这里有点难理解,详细说明下:
#调用了2次_GetNormalizedDeltas方法,第一次传进去的timestamps参数大概是[149546224125229,149546240966791...一堆一堆]
#然后返回的deltas是[1X16, 3X16, 1X16, 2X16, 3X16....],因为都是3列中的中间那列数据,那列数据是垂直同步时间,
#所有数据的间隔是刷新周期16的整数倍。(不明白自行脑补垂直同步:http://article.yeeyan.org/view/37503/304664)
#好,现在再一次调用_GetNormalizedDeltas(),返回的normalized_changes是[2, -2, 1, 1 ...]
#然后jankiness = [max(0, round(change)) for change in normalized_changes]
#jankiness = [2, 0, 1, 1....]
#so,change大于0且小于20(估计作者是认为比20大都卡死了吧),就算一次jank_count
"""
if len(frame_lengths) < frame_count - 1:
logging.warning('Skipping frame lengths that are too short.')
frame_count = len(frame_lengths) + 1
if len(frame_lengths) == 0:
raise Exception('No valid frames lengths found.')
length_changes, normalized_changes = \
SurfaceStatsCollector._GetNormalizedDeltas(
frame_lengths, refresh_period)
jankiness = [max(0, round(change)) for change in normalized_changes]
pause_threshold = 20
jank_count = sum(1 for change in jankiness
if change > 0 and change < pause_threshold)
return [ #返回结果,包过平均fps,和 jank的总数
SurfaceStatsCollector.Result(
'avg_surface_fps' + result_suffix,
int(round((frame_count - 1) / seconds)), 'fps'),
SurfaceStatsCollector.Result(
'jank_count' + result_suffix, jank_count, 'janks'),
SurfaceStatsCollector.Result(
'max_frame_delay' + result_suffix,
round(max(normalized_frame_lengths)),
'vsyncs'),
SurfaceStatsCollector.Result(
'frame_lengths' + result_suffix, normalized_frame_lengths,
'vsyncs'),
]
@staticmethod
def _CalculateBuckets(refresh_period, timestamps):
results = []
for pct in [0.99, 0.5]:
sliced = timestamps[min(int(-pct * len(timestamps)), -3) : ]
#print sliced
results += SurfaceStatsCollector._CalculateResults(
refresh_period, sliced, '_' + str(int(pct * 100)))
return results
第二种就简单了,开始和结束时,分别用 service call SurfaceFlinger 1013 取帧数的 index 和开始结束时间戳,
2 个 index 相减就是帧总数,在除以时间相减,就等到了平均 fps,so easy!但是感觉这个方法有点问题,用的时间是
当前电脑返回的时间,可能算出来结果不太准。
def _GetSurfaceStatsLegacy(self):
"""Legacy method (before JellyBean), returns the current Surface index
and timestamp.
Calculate FPS by measuring the difference of Surface index returned by
SurfaceFlinger in a period of time.
Returns:
Dict of {page_flip_count (or 0 if there was an error), timestamp}.
"""
results = self._adb.RunShellCommand('service call SurfaceFlinger 1013')
assert len(results) == 1
match = re.search('^Result: Parcel\((\w+)', results[0])
cur_surface = 0
if match:
try:
cur_surface = int(match.group(1), 16)
except Exception:
logging.error('Failed to parse current surface from ' + match.group(1))
else:
logging.warning('Failed to call SurfaceFlinger surface ' + results[0])
return {
'page_flip_count': cur_surface,
'timestamp': datetime.datetime.now(),
}
===========================
总的来说,2 种方法都不是绝对准确,从实际测试结果来看,还是能体现出效果的,可以当做一个量化的指标。