linux-brain/include
Lukasz Luba 156eaacba3 PM: EM: Increase energy calculation precision
[ Upstream commit 7fcc17d0cb12938d2b3507973a6f93fc9ed2c7a1 ]

The Energy Model (EM) provides useful information about device power in
each performance state to other subsystems like: Energy Aware Scheduler
(EAS). The energy calculation in EAS does arithmetic operation based on
the EM em_cpu_energy(). Current implementation of that function uses
em_perf_state::cost as a pre-computed cost coefficient equal to:
cost = power * max_frequency / frequency.
The 'power' is expressed in milli-Watts (or in abstract scale).

There are corner cases when the EAS energy calculation for two Performance
Domains (PDs) return the same value. The EAS compares these values to
choose smaller one. It might happen that this values are equal due to
rounding error. In such scenario, we need better resolution, e.g. 1000
times better. To provide this possibility increase the resolution in the
em_perf_state::cost for 64-bit architectures. The cost of increasing
resolution on 32-bit is pretty high (64-bit division) and is not justified
since there are no new 32bit big.LITTLE EAS systems expected which would
benefit from this higher resolution.

This patch allows to avoid the rounding to milli-Watt errors, which might
occur in EAS energy estimation for each PD. The rounding error is common
for small tasks which have small utilization value.

There are two places in the code where it makes a difference:
1. In the find_energy_efficient_cpu() where we are searching for
best_delta. We might suffer there when two PDs return the same result,
like in the example below.

Scenario:
Low utilized system e.g. ~200 sum_util for PD0 and ~220 for PD1. There
are quite a few small tasks ~10-15 util. These tasks would suffer for
the rounding error. These utilization values are typical when running games
on Android. One of our partners has reported 5..10mA less battery drain
when running with increased resolution.

Some details:
We have two PDs: PD0 (big) and PD1 (little)
Let's compare w/o patch set ('old') and w/ patch set ('new')
We are comparing energy w/ task and w/o task placed in the PDs

a) 'old' w/o patch set, PD0
task_util = 13
cost = 480
sum_util_w/o_task = 215
sum_util_w_task = 228
scale_cpu = 1024
energy_w/o_task = 480 * 215 / 1024 = 100.78 => 100
energy_w_task = 480 * 228 / 1024 = 106.87 => 106
energy_diff = 106 - 100 = 6
(this is equal to 'old' PD1's energy_diff in 'c)')

b) 'new' w/ patch set, PD0
task_util = 13
cost = 480 * 1000 = 480000
sum_util_w/o_task = 215
sum_util_w_task = 228
energy_w/o_task = 480000 * 215 / 1024 = 100781
energy_w_task = 480000 * 228 / 1024  = 106875
energy_diff = 106875 - 100781 = 6094
(this is not equal to 'new' PD1's energy_diff in 'd)')

c) 'old' w/o patch set, PD1
task_util = 13
cost = 160
sum_util_w/o_task = 283
sum_util_w_task = 293
scale_cpu = 355
energy_w/o_task = 160 * 283 / 355 = 127.55 => 127
energy_w_task = 160 * 296 / 355 = 133.41 => 133
energy_diff = 133 - 127 = 6
(this is equal to 'old' PD0's energy_diff in 'a)')

d) 'new' w/ patch set, PD1
task_util = 13
cost = 160 * 1000 = 160000
sum_util_w/o_task = 283
sum_util_w_task = 293
scale_cpu = 355
energy_w/o_task = 160000 * 283 / 355 = 127549
energy_w_task = 160000 * 296 / 355 =   133408
energy_diff = 133408 - 127549 = 5859
(this is not equal to 'new' PD0's energy_diff in 'b)')

2. Difference in the 6% energy margin filter at the end of
find_energy_efficient_cpu(). With this patch the margin comparison also
has better resolution, so it's possible to have better task placement
thanks to that.

Fixes: 27871f7a8a ("PM: Introduce an Energy Model management framework")
Reported-by: CCJ Yeh <CCj.Yeh@mediatek.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15 09:47:33 +02:00
..
acpi ACPI: fix NULL pointer dereference 2021-08-08 09:04:08 +02:00
asm-generic vmlinux.lds.h: Handle clang's module.{c,d}tor sections 2021-08-18 08:57:04 +02:00
clocksource
crypto crypto: shash - avoid comparing pointers to exported functions under CFI 2021-07-14 16:53:13 +02:00
drm drm: Return -ENOTTY for non-drm ioctls 2021-07-28 13:31:01 +02:00
dt-bindings clk: imx8mn: Fix incorrect clock defines 2020-03-18 07:17:55 +01:00
keys certs: Add EFI_CERT_X509_GUID support for dbx entries 2021-06-30 08:47:55 -04:00
kvm
linux PM: EM: Increase energy calculation precision 2021-09-15 09:47:33 +02:00
math-emu
media media: subdev: disallow ioctl for saa6588/davinci 2021-07-19 08:53:17 +02:00
misc
net psample: Add a fwd declaration for skbuff 2021-08-18 08:56:59 +02:00
pcmcia
ras
rdma RDMA/umem: Fix signature of stub ib_umem_find_best_pgsz() 2020-10-29 09:57:47 +01:00
scsi scsi: iscsi: Fix conn use after free during resets 2021-07-20 16:10:43 +02:00
soc irqchip/eznps: Fix build error for !ARC700 builds 2020-09-17 13:47:47 +02:00
sound ALSA: hda: intel-nhlt: verify config type 2021-03-09 11:09:39 +01:00
target scsi: target: core: Add cmd length set before cmd complete 2021-03-17 17:03:45 +01:00
trace afs: Fix tracepoint string placement with built-in AFS 2021-07-28 13:30:58 +02:00
uapi bpf: Fix a typo of reuseport map in bpf.h. 2021-09-15 09:47:30 +02:00
vdso
video
xen Xen/gntdev: correct error checking in gntdev_map_grant_pages() 2021-02-23 15:02:26 +01:00