Commit Graph

388 Commits

Author SHA1 Message Date
Daniel Lezcano b57d62862d thermal/core: Remove unneeded ida_destroy()
As per documentation for the ida_destroy() function: "If the IDA is
already empty, there is no need to call this function."

The thermal framework is in the init sequence, so the ida was not yet
used and consequently it is empty in case of error.

There is no need to call ida_destroy(), let's remove the calls.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-25 16:40:39 +01:00
Daniel Lezcano 58d1c9fd0e thermal/core: Fix unregistering netlink at thermal init time
The thermal subsystem initialization miss an netlink unregistering
function in the error. Add it.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-25 16:40:39 +01:00
Viresh Kumar 47e3f00074 thermal: core: Use device_unregister() instead of device_del/put()
Lets not open code device_unregister() unnecessarily.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-24 20:22:55 +01:00
Viresh Kumar e398421fd0 thermal: core: Move cdev cleanup to thermal_release()
thermal_release() already frees cdev, let it do rest of the cleanup as
well in order to simplify the error paths in
__thermal_cooling_device_register().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-24 20:21:49 +01:00
Rafael J. Wysocki a2c81dc59d Merge back thermal control material for 6.3. 2023-01-23 18:52:53 +01:00
Viresh Kumar 6c54b7bc8a thermal: core: call put_device() only after device_register() fails
put_device() shouldn't be called before a prior call to
device_register(). __thermal_cooling_device_register() doesn't follow
that properly and needs fixing. Also
thermal_cooling_device_destroy_sysfs() is getting called unnecessarily
on few error paths.

Fix all this by placing the calls at the right place.

Based on initial work done by Caleb Connolly.

Fixes: 4748f9687c ("thermal: core: fix some possible name leaks in error paths")
Fixes: c408b3d1d9 ("thermal: Validate new state in cur_state_store()")
Reported-by: Caleb Connolly <caleb.connolly@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Frank Rowand <frowand.list@gmail.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Tested-by: Caleb Connolly <caleb.connolly@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-19 21:06:41 +01:00
Johan Hovold e6ec64f852 thermal/drivers/qcom: Fix set_trip_temp() deadlock
The set_trip_temp() callback is used when changing the trip temperature
through sysfs. As it is called with the thermal-zone-device lock held
it must not use thermal_zone_get_trip() directly or it will deadlock.

Fixes: 78c3e2429be8 ("thermal/drivers/qcom: Use generic thermal_zone_get_trip() function")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20221214131617.2447-2-johan+linaro@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
2023-01-06 14:14:48 +01:00
Daniel Lezcano 2e38a2a981 thermal/core: Add a generic thermal_zone_set_trip() function
The thermal zone ops defines a set_trip callback where we can invoke
the backend driver to set an interrupt for the next trip point
temperature being crossed the way up or down, or setting the low level
with the hysteresis.

The ops is only called from the thermal sysfs code where the userspace
has the ability to modify a trip point characteristic.

With the effort of encapsulating the thermal framework core code,
let's create a thermal_zone_set_trip() which is the writable side of
the thermal_zone_get_trip() and put there all the ops encapsulation.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20221003092602.1323944-4-daniel.lezcano@linaro.org
2023-01-06 14:14:47 +01:00
Daniel Lezcano 7c3d5c20dc thermal/core: Add a generic thermal_zone_get_trip() function
The thermal_zone_device_ops structure defines a set of ops family,
get_trip_temp(), get_trip_hyst(), get_trip_type(). Each of them is
returning a property of a trip point.

The result is the code is calling the ops everywhere to get a trip
point which is supposed to be defined in the backend driver. It is a
non-sense as a thermal trip can be generic and used by the backend
driver to declare its trip points.

Part of the thermal framework has been changed and all the OF thermal
drivers are using the same definition for the trip point and use a
thermal zone registration variant to pass those trip points which are
part of the thermal zone device structure.

Consequently, we can use a generic function to get the trip points
when they are stored in the thermal zone device structure.

This approach can be generalized to all the drivers and we can get rid
of the ops->get_trip_*. That will result to a much more simpler code
and make possible to rework how the thermal trip are handled in the
thermal core framework as discussed previously.

This change adds a function thermal_zone_get_trip() where we get the
thermal trip point structure which contains all the properties (type,
temp, hyst) instead of doing multiple calls to ops->get_trip_*.

That opens the door for trip point extension with more attributes. For
instance, replacing the trip points disabled bitmask with a 'disabled'
field in the structure.

Here we replace all the calls to ops->get_trip_* in the thermal core
code with a call to the thermal_zone_get_trip() function.

The thermal zone ops defines a callback to retrieve the critical
temperature. As the trip handling is being reworked, all the trip
points will be the same whatever the driver and consequently finding
the critical trip temperature will be just a loop to search for a
critical trip point type.

Provide such a generic function, so we encapsulate the ops
get_crit_temp() which can be removed when all the backend drivers are
using the generic trip points handling.

While at it, add the thermal_zone_get_num_trips() to encapsulate the
code more and reduce the grip with the thermal framework internals.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20221003092602.1323944-2-daniel.lezcano@linaro.org
2023-01-06 14:14:47 +01:00
Yang Yingliang 4748f9687c thermal: core: fix some possible name leaks in error paths
In some error paths before device_register(), the names allocated
by dev_set_name() are not freed. Move dev_set_name() front to
device_register(), so the name can be freed while calling
put_device().

Fixes: 1dd7128b83 ("thermal/core: Fix null pointer dereference in thermal_release()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-25 19:51:41 +01:00
Guenter Roeck b778b4d782 thermal/core: Protect thermal device operations against thermal device removal
Thermal device operations may be called after thermal zone device removal.
After thermal zone device removal, thermal zone device operations must
no longer be called. To prevent such calls from happening, ensure that
the thermal device is registered before executing any thermal device
operations.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Guenter Roeck 05eeee2b51 thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex
Protect access to thermal operations against thermal zone removal by
acquiring the thermal zone device mutex. After acquiring the mutex, check
if the thermal zone device is registered and abort the operation if not.

With this change, we can call __thermal_zone_device_update() instead of
thermal_zone_device_update() from trip_point_temp_store() and from
emul_temp_store(). Similar, we can call __thermal_zone_set_trips() instead
of thermal_zone_set_trips() from trip_point_hyst_store().

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Guenter Roeck 1c439dec35 thermal/core: Introduce locked version of thermal_zone_device_update
In thermal_zone_device_set_mode(), the thermal zone mutex is released only
to be reacquired in the subsequent call to thermal_zone_device_update().

Introduce __thermal_zone_device_update(), which is similar to
thermal_zone_device_update() but has to be called with the thermal device
mutex held. Call the new function from thermal_zone_device_set_mode()
to avoid the extra thermal device mutex release/acquire sequence in that
function.

With the new function in place, re-implement thermal_zone_device_update()
as wrapper around __thermal_zone_device_update() to acquire and release
the thermal device mutex.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Guenter Roeck 30b2ae07d3 thermal/core: Delete device under thermal device zone lock
Thermal device attributes may still be opened after unregistering
the thermal zone and deleting the thermal device.

Currently there is no protection against accessing thermal device
operations after unregistering a thermal zone. To enable adding
such protection, protect the device delete operation with the
thermal zone device mutex. This requires splitting the call to
device_unregister() into its components, device_del() and put_device().
Only the first call can be executed under mutex protection, since
put_device() may result in releasing the thermal zone device memory.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Guenter Roeck d35f29ed9d thermal/core: Destroy thermal zone device mutex in release function
Accesses to thermal zones, and with it the thermal zone device mutex,
are still possible after the thermal zone device has been unregistered.
For example, thermal_zone_get_temp() can be called from temp_show()
in thermal_sysfs.c if the sysfs attribute was opened before the thermal
device was unregistered.

Move the call to mutex_destroy from thermal_zone_device_unregister()
to thermal_release() to ensure that it is only destroyed after it is
guaranteed to be no longer accessed.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Dan Carpenter e49a1e1ee0 thermal/core: fix error code in __thermal_cooling_device_register()
Return an error pointer if ->get_max_state() fails.  The current code
returns NULL which will cause an oops in the callers.

Fixes: c408b3d1d9 ("thermal: Validate new state in cur_state_store()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-28 19:59:51 +02:00
Viresh Kumar c408b3d1d9 thermal: Validate new state in cur_state_store()
In cur_state_store(), the new state of the cooling device is received
from user-space and is not validated by the thermal core but the same is
left for the individual drivers to take care of. Apart from duplicating
the code it leaves possibility for introducing bugs where a driver may
not do it right.

Lets make the thermal core check the new state itself and store the max
value in the cooling device structure.

Link: https://lore.kernel.org/all/Y0ltRJRjO7AkawvE@kili/
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-25 18:58:11 +02:00
Lad Prabhakar c71d8035f1 thermal/core: Drop valid pointer check for type
Drop the valid pointer check for type in
thermal_zone_device_register_with_trips() as we already have it confirmed
for != NULL from the previous if block.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20220909181322.10933-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-10-04 11:21:30 +02:00
Sumeet Pawnikar 82b1ec794d thermal: core: Increase maximum number of trip points
On one of the Chrome system, if we define more than 12 trip points,
probe for thermal sensor fails with
"int3403 thermal: probe of INTC1046:03 failed with error -22"
and throws an error as
"thermal_sys: Error: Incorrect number of thermal trips".

The thermal_zone_device_register() interface needs maximum
number of trip points supported in a zone as an argument.
This number can't exceed THERMAL_MAX_TRIPS, which is currently
set to 12. To address this issue, THERMAL_MAX_TRIPS value
has to be increased.

This interface also has an argument to specify a mask of trips
which are writable. This mask is defined as an int.
This mask sets the ceiling for increasing maximum number of
supported trips. With the current implementation, maximum number
of trips can be supported is 31.

Also, THERMAL_MAX_TRIPS macro is used in one place only.
So, remove THERMAL_MAX_TRIPS macro and compare num_trips
directly with using a macro BITS_PER_TYPE(int)-1.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-30 19:50:10 +02:00
Wolfram Sang 1e6c8fb8b8 thermal: move from strlcpy() with unused retval to strscpy()
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-31 21:13:35 +02:00
Daniel Lezcano 2f9d142c93 thermal/core: Fix lockdep_assert() warning
The function thermal_zone_device_is_enabled() must be called with the
thermal zone lock held. In the resume path, it is called without.

As the thermal_zone_device_is_enabled() is also checked in
thermal_zone_device_update(), do the check in resume() function is
pointless, except for saving an extra initialization which does not
hurt if it is done in all the cases.

Fixes: ca48ad71717dd ("thermal/core: Move the mutex inside the thermal_zone_device_update() function")
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
2022-08-17 14:09:39 +02:00
Daniel Lezcano a930da9bf5 thermal/core: Move the mutex inside the thermal_zone_device_update() function
All the different calls inside the thermal_zone_device_update()
function take the mutex.

The previous changes move the mutex out of the different functions,
like the throttling ops. Now that the mutexes are all at the same
level in the call stack for the thermal_zone_device_update() function,
they can be moved inside this one.

That has the benefit of:

1. Simplify the code by not having a plethora of places where the lock is taken

2. Probably closes more race windows because releasing the lock from
one line to another can give the opportunity to the thermal zone to change
its state in the meantime. For example, the thermal zone can be
enabled right after checking it is disabled.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-5-daniel.lezcano@linaro.org
2022-08-17 14:09:39 +02:00
Daniel Lezcano 670a5e356c thermal/core: Move the thermal zone lock out of the governors
All the governors throttling ops are taking/releasing the lock at the
beginning and the end of the function.

We can move the mutex to the throttling call site instead.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-4-daniel.lezcano@linaro.org
2022-08-17 14:09:39 +02:00
Daniel Lezcano 15a73839e3 thermal/core: Rework the monitoring a bit
The should_stop_polling() function wraps the function
thermal_zone_device_is_enabled().

The monitor_thermal_zone() function checks if the thermal zone is
enabled via the should_stop_polling() function.

However, the instant after checking the thermal zone is enabled, this
one can be disabled, so even if that reduces the race window, it does
not prevent that and the monitoring can be set again with the thermal
zone disabled.

For this reason, the function should_stop_polling() is replaced by a
direct check of the thermal zone mode with the mutex locks held, that
prevents the situation described above.

As the semantic is clear with the thermal_zone_is_enabled() function,
we can remove the should_stop_polling() function and replace the check
with the former function.

While at it, reorder the checks to improve the readability of the
monitor_thermal_zone() function.

In the future, the thermal_zone_device_disable() and the
thermal_zone_device_enable() functions should unset / set the polling
timer directly instead of relying on the next
thermal_zone_device_update() call to do that. That will make a
synchronous thermal zone mode change but the locking scheme should be
double checked for that which out of the scope of this change.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-2-daniel.lezcano@linaro.org
2022-08-17 14:09:39 +02:00
Daniel Lezcano 9662756a9a thermal/core: Rearm the monitoring only one time
The current code calls monitor_thermal_zone() inside the
handle_thermal_trip() function. But this one is called in a loop for
each trip point which means the monitoring is rearmed several times
for nothing (assuming there could be several passive and active trip
points).

Move the monitor_thermal_zone() function out of the
handle_thermal_trip() function and after the thermal trip loop, so the
timer will be disabled or rearmed one time.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-1-daniel.lezcano@linaro.org
2022-08-17 14:09:39 +02:00
Daniel Lezcano 48ad3b104b thermal/of: Make new code and old code co-exist
This transient change allows to use old and new OF together until all
the drivers are converted to use the new OF API.

This will go away when the old OF code will be removed.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-3-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-08-17 14:09:37 +02:00
Daniel Lezcano a921be53b4 thermal/core: Add missing EXPORT_SYMBOL_GPL
The function thermal_zone_device_register_with_trips() is not exported
for modules.

Add the missing EXPORT_SYMBOL_GPL().

Fixes: fae11de507 ("thermal/core: Add thermal_trip in thermal_zone")
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220810100731.749317-1-daniel.lezcano@linaro.org
2022-08-15 20:38:19 +02:00
Rafael J. Wysocki da9d01794e - Make per cpufreq / devfreq cooling device ops instead of using a
global variable, fix comments and rework the trace information
   (Lukasz Luba)
 
 - Add the include/dt-bindings/thermal.h under the area covered by the
   thermal maintainer in the MAINTAINERS file (Lukas Bulwahn)
 
 - Improve the error output by giving the sensor identification when a
   thermal zone failed to initialize, the DT bindings by changing the
   positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
   Sang)
 
 - Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
   Kozlowski)
 
 - Remove the pointless get_trend() function in the QCom, Ux500 and
   tegra thermal drivers, along with the unused DROP_FULL and
   RAISE_FULL trends definitions. Simplify the code by using clamp()
   macros (Daniel Lezcano)
 
 - Fix ref_table memory leak at probe time on the k3_j72xx bandgap
   (Bryan Brattlof)
 
 - Fix array underflow in prep_lookup_table (Dan Carpenter)
 
 - Add static annotation to the k3_j72xx_bandgap_j7* data structure
   (Jin Xiaoyun)
 
 - Fix typos in comments detected on sun8i by Coccinelle (Julia Lawall)
 
 - Fix typos in comments on rzg2l (Biju Das)
 
 - Remove as unnecessary call to dev_err() as the error is already
   printed by the failing function on u8500 (Yang Li)
 
 - Register the thermal zones as hwmon sensors for the Qcom thermal
   sensors (Dmitry Baryshkov)
 
 - Fix 'tmon' tool compilation issue by adding phtread.h include
   (Markus Mayer)
 
 - Fix typo in the comments for the 'tmon' tool (Slark Xiao)
 
 - Consolidate the thermal core code by beginning to move the thermal
   trip structure from the thermal OF code as a generic structure to be
   used by the different sensors when registering a thermal zone
   (Daniel Lezcano)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmLkDEgACgkQqDIjiipP
 6E9PPAf/fZRYgzqgv68lYy2hnJBZEha7z76KyKKxbPATy65VQHzHBqWyPgOnZWx8
 xm26tlDJMFEGql/Sy5QetvnFdDqvY33Q0FBhDbmCdCp7vxxirDNKxXhGnxUggCIt
 PrloMzC9zjgdNaFTclf/ceCFNwHPnY8l5kxGHhVDn/l5vvGFB869HKMT+13FMCQM
 cKVNZY0F3BgmY0ouAMbXT2jwNm/FIYfXC9CFaQo9XhiTAvqU1h4BI08S8JdXsve0
 VVBi8MB0sBolWIQ/GVlC1IWj1FhxgMfvcfZAOlyia7I4kQz7K5wAHxiHnhA+GHsZ
 NdxVeGTIdmjIInvRxnsT7yR2HcitkA==
 =sDfh
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal control changes for 5.20-rc1 from Daniel Lezcano:

"- Make per cpufreq / devfreq cooling device ops instead of using a
   global variable, fix comments and rework the trace information
   (Lukasz Luba)

 - Add the include/dt-bindings/thermal.h under the area covered by the
   thermal maintainer in the MAINTAINERS file (Lukas Bulwahn)

 - Improve the error output by giving the sensor identification when a
   thermal zone failed to initialize, the DT bindings by changing the
   positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
   Sang)

 - Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
   Kozlowski)

 - Remove the pointless get_trend() function in the QCom, Ux500 and
   tegra thermal drivers, along with the unused DROP_FULL and
   RAISE_FULL trends definitions. Simplify the code by using clamp()
   macros (Daniel Lezcano)

 - Fix ref_table memory leak at probe time on the k3_j72xx bandgap
   (Bryan Brattlof)

 - Fix array underflow in prep_lookup_table (Dan Carpenter)

 - Add static annotation to the k3_j72xx_bandgap_j7* data structure
   (Jin Xiaoyun)

 - Fix typos in comments detected on sun8i by Coccinelle (Julia Lawall)

 - Fix typos in comments on rzg2l (Biju Das)

 - Remove as unnecessary call to dev_err() as the error is already
   printed by the failing function on u8500 (Yang Li)

 - Register the thermal zones as hwmon sensors for the Qcom thermal
   sensors (Dmitry Baryshkov)

 - Fix 'tmon' tool compilation issue by adding phtread.h include
   (Markus Mayer)

 - Fix typo in the comments for the 'tmon' tool (Slark Xiao)

 - Consolidate the thermal core code by beginning to move the thermal
   trip structure from the thermal OF code as a generic structure to be
   used by the different sensors when registering a thermal zone
   (Daniel Lezcano)"

* tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (36 commits)
  thermal/of: Initialize trip points separately
  thermal/of: Use thermal trips stored in the thermal zone
  thermal/core: Add thermal_trip in thermal_zone
  thermal/core: Rename 'trips' to 'num_trips'
  thermal/core: Move thermal_set_delay_jiffies to static
  thermal/core: Remove unneeded EXPORT_SYMBOLS
  thermal/of: Move thermal_trip structure to thermal.h
  thermal/of: Remove the device node pointer for thermal_trip
  thermal/of: Replace device node match with device node search
  thermal/core: Remove duplicate information when an error occurs
  thermal/core: Avoid calling ->get_trip_temp() unnecessarily
  thermal/tools/tmon: Fix typo 'the the' in comment
  thermal/tools/tmon: Include pthread and time headers in tmon.h
  thermal/ti-soc-thermal: Fix comment typo
  thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors
  thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors
  thermal/drivers/u8500: Remove unnecessary print function dev_err()
  thermal/drivers/rzg2l: Fix comments
  thermal/drivers/sun8i: Fix typo in comment
  thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static
  ...
2022-07-29 19:10:56 +02:00
Daniel Lezcano fae11de507 thermal/core: Add thermal_trip in thermal_zone
The thermal trip points are properties of a thermal zone and the
different sub systems should be able to save them in the thermal zone
structure instead of having their own definition.

Give the opportunity to the drivers to create a thermal zone with
thermal trips which will be accessible directly from the thermal core
framework.

As we added the thermal trip points structure in the thermal zone,
let's extend the thermal zone register function to have the thermal
trip structures as a parameter and store it in the 'trips' field of
the thermal zone structure.

The thermal zone contains the trip point, we can store them directly
when registering the thermal zone. That will allow another step
forward to remove the duplicate thermal zone structure we find in the
thermal_of code.

Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220722200007.1839356-9-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28 17:29:56 +02:00
Daniel Lezcano e5bfcd30f8 thermal/core: Rename 'trips' to 'num_trips'
In order to use thermal trips defined in the thermal structure, rename
the 'trips' field to 'num_trips' to have the 'trips' field containing the
thermal trip points.

Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220722200007.1839356-8-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28 17:29:56 +02:00
Daniel Lezcano e5f2cda61d thermal/core: Move thermal_set_delay_jiffies to static
The function 'thermal_set_delay_jiffies' is only used in
thermal_core.c but it is defined and implemented in a separate
file. Move the function to thermal_core.c and make it static.

Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220722200007.1839356-7-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28 17:29:55 +02:00
Daniel Lezcano 3f95ac3245 thermal/core: Remove duplicate information when an error occurs
The pr_err already tells it is an error, it is pointless to add the
'Error:' string in the messages. Remove them.

Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220722200007.1839356-2-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28 17:29:53 +02:00
Daniel Lezcano 50e53291e9 thermal/core: Avoid calling ->get_trip_temp() unnecessarily
As the trip temperature is already available when calling the function
handle_critical_trips(), pass it as a parameter instead of having this
function calling the ops again to retrieve the same data.

Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220718145038.1114379-2-daniel.lezcano@linaro.org
2022-07-28 17:29:52 +02:00
keliu 5a5b7d8d54 thermal: Directly use ida_alloc()/free()
Use ida_alloc()/ida_free() instead of deprecated
ida_simple_get()/ida_simple_remove() as recommended.

Signed-off-by: keliu <liuke94@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-14 16:05:35 +02:00
Yang Yingliang 98a160e898 thermal/core: Fix memory leak in __thermal_cooling_device_register()
I got memory leak as follows when doing fault injection test:

unreferenced object 0xffff888010080000 (size 264312):
  comm "182", pid 102533, jiffies 4296434960 (age 10.100s)
  hex dump (first 32 bytes):
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
    ff ff ff ff ff ff ff ff 40 7f 1f b9 ff ff ff ff  ........@.......
  backtrace:
    [<0000000038b2f4fc>] kmalloc_order_trace+0x1d/0x110 mm/slab_common.c:969
    [<00000000ebcb8da5>] __kmalloc+0x373/0x420 include/linux/slab.h:510
    [<0000000084137f13>] thermal_cooling_device_setup_sysfs+0x15d/0x2d0 include/linux/slab.h:586
    [<00000000352b8755>] __thermal_cooling_device_register+0x332/0xa60 drivers/thermal/thermal_core.c:927
    [<00000000fb9f331b>] devm_thermal_of_cooling_device_register+0x6b/0xf0 drivers/thermal/thermal_core.c:1041
    [<000000009b8012d2>] max6650_probe.cold+0x557/0x6aa drivers/hwmon/max6650.c:211
    [<00000000da0b7e04>] i2c_device_probe+0x472/0xac0 drivers/i2c/i2c-core-base.c:561

If device_register() fails, thermal_cooling_device_destroy_sysfs() need be called
to free the memory allocated in thermal_cooling_device_setup_sysfs().

Fixes: 8ea229511e ("thermal: Add cooling device's statistics in sysfs")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220511020605.3096734-1-yangyingliang@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:53 +02:00
Manaf Meethalavalappu Pallikunhi 99b63316c3 thermal: core: Reset previous low and high trip during thermal zone init
During the suspend is in process, thermal_zone_device_update bails out
thermal zone re-evaluation for any sensor trip violation without
setting next valid trip to that sensor. It assumes during resume
it will re-evaluate same thermal zone and update trip. But when it is
in suspend temperature goes down and on resume path while updating
thermal zone if temperature is less than previously violated trip,
thermal zone set trip function evaluates the same previous high and
previous low trip as new high and low trip. Since there is no change
in high/low trip, it bails out from thermal zone set trip API without
setting any trip. It leads to a case where sensor high trip or low
trip is disabled forever even though thermal zone has a valid high
or low trip.

During thermal zone device init, reset thermal zone previous high
and low trip. It resolves above mentioned scenario.

Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-16 20:29:27 +01:00
Ziyang Xuan 0a5c26712f thermal/core: fix a UAF bug in __thermal_cooling_device_register()
When device_register() return failed, program will goto out_kfree_type
to release 'cdev->device' by put_device(). That will call thermal_release()
to free 'cdev'. But the follow-up processes access 'cdev' continually.
That trggers the UAF bug.

====================================================================
BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 dump_stack_lvl+0xe2/0x152
 print_address_description.constprop.0+0x21/0x140
 ? __thermal_cooling_device_register+0x75b/0xa90
 kasan_report.cold+0x7f/0x11b
 ? __thermal_cooling_device_register+0x75b/0xa90
 __thermal_cooling_device_register+0x75b/0xa90
 ? memset+0x20/0x40
 ? __sanitizer_cov_trace_pc+0x1d/0x50
 ? __devres_alloc_node+0x130/0x180
 devm_thermal_of_cooling_device_register+0x67/0xf0
 max6650_probe.cold+0x557/0x6aa
......

Freed by task 258:
 kasan_save_stack+0x1b/0x40
 kasan_set_track+0x1c/0x30
 kasan_set_free_info+0x20/0x30
 __kasan_slab_free+0x109/0x140
 kfree+0x117/0x4c0
 thermal_release+0xa0/0x110
 device_release+0xa7/0x240
 kobject_put+0x1ce/0x540
 put_device+0x20/0x30
 __thermal_cooling_device_register+0x731/0xa90
 devm_thermal_of_cooling_device_register+0x67/0xf0
 max6650_probe.cold+0x557/0x6aa [max6650]

Do not use 'cdev' again after put_device() to fix the problem like doing
in thermal_zone_device_register().

[dlezcano]: as requested by Rafael, change the affectation into two statements.

Fixes: 5848376181 ("thermal/drivers/core: Use a char pointer for the cooling device name")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-15 15:38:48 +02:00
Yuanzheng Song 1dd7128b83 thermal/core: Fix null pointer dereference in thermal_release()
If both dev_set_name() and device_register() failed, then null pointer
dereference occurs in thermal_release() which will use strncmp() to
compare the name.

So fix it by adding dev_set_name() return value check.

Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Link: https://lore.kernel.org/r/20211015083230.67658-1-songyuanzheng@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-15 13:58:36 +02:00
Daniel Lezcano fc656fa14d thermal/drivers/netlink: Add the temperature when crossing a trip point
The slope of the temperature increase or decrease can be high and when
the temperature crosses the trip point, there could be a significant
difference between the trip temperature and the measured temperatures.

That forces the userspace to read the temperature back right after
receiving a trip violation notification.

In order to be efficient, give the temperature which resulted in the
trip violation.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20211001223323.1836640-1-daniel.lezcano@linaro.org
2021-10-07 15:41:38 +02:00
Dan Carpenter 1bb30b20b4 thermal/core: Potential buffer overflow in thermal_build_list_of_policies()
After printing the list of thermal governors, then this function prints
a newline character.  The problem is that "size" has not been updated
after printing the last governor.  This means that it can write one
character (the NUL terminator) beyond the end of the buffer.

Get rid of the "size" variable and just use "PAGE_SIZE - count" directly.

Fixes: 1b4f48494e ("thermal: core: group functions related to governor handling")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210916131342.GB25094@kili
2021-09-21 15:17:11 +02:00
Linus Torvalds f7ea4be434 - Add rk3568 sensor support (Finley Xiao)
- Add missing MODULE_DEVICE_TABLE for the Spreadtrum sensor (Chunyan
   Zhang)
 
 - Export additionnal attributes for the int340x thermal processor
   (Srinivas Pandruvada)
 
 - Add SC7280 compatible for the tsens driver (Rajeshwari Ravindra
   Kamble)
 
 - Fix kernel documentation for thermal_zone_device_unregister()	and
   use devm_platform_get_and_ioremap_resource() (Yang Yingliang)
 
 - Fix coefficient calculations for the rcar_gen3 sensor driver (Niklas
   Söderlund)
 
 - Fix shadowing variable rcar_gen3_ths_tj_1 (Geert Uytterhoeven)
 
 - Add missing of_node_put() for the iMX and Spreadtrum sensors
   (Krzysztof Kozlowski)
 
 - Add tegra3 thermal sensor DT bindings (Dmitry Osipenko)
 
 - Stop the thermal zone monitoring when unregistering it to prevent a
   temperature update without the 'get_temp' callback (Dmitry Osipenko)
 
 - Add rk3568 DT bindings, convert bindings to yaml schemas and add the
   corresponding compatible in the Rockchip sensor (Ezequiel Garcia)
 
 - Add the sc8180x compatible for the Qualcomm tsensor (Bjorn Andersson)
 
 - Use the find_first_zero_bit()	function instead of custom code (Andy
   Shevchenko)
 
 - Fix the kernel doc for the device cooling device (Yang Li)
 
 - Reorg the processor thermal int340x to set the scene for the PCI
   mmio driver (Srinivas Pandruvada)
 
 - Add PCI MMIO driver for the int340x processor thermal driver
   (Srinivas Pandruvada)
 
 - Add hwmon sensors for the mediatek sensor (Frank Wunderlich)
 
 - Fix warning for return value reported by Smatch for the int340x
   thermal processor (Srinivas Pandruvada)
 
 - Fix wrong register access and decoding for the int340x thermal
   processor (Srinivas Pandruvada)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmDkbdAACgkQqDIjiipP
 6E8pTwgAgc4RMdSBdNMVWP1Lc3Gprmg8uXMhma9ZlmJD+l3h4b4P+Zm7HjU+SPHI
 emvvqHgiWGw/ta/Fuhx9XnqTUjZznG3gMYohnfKe7jPgVTxmud+Yf0/E3dRrDNWl
 WNolS8rv4dLf1mjR+vGZ63KasK0Rz5Z4YxDV4kOvh+/VUhqg3XpZ1OTKOh/B9IWX
 pUTedq7hIZ44ZMINcwObvLNTeaoEhPNpgA4Rs07vQPYugY0V61HszqR/Mu+l8Rgp
 LiE8NUBANJ+8+wydHe/vP6lvOthh7YGSx4091rUe+X17qgfBDKCTsOyECnZ/UX+r
 aB1MaTAnr1H0dfM49yeoFcMSPc1rGA==
 =q0nG
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal updates from Daniel Lezcano:

 - Add rk3568 sensor support (Finley Xiao)

 - Add missing MODULE_DEVICE_TABLE for the Spreadtrum sensor (Chunyan
   Zhang)

 - Export additionnal attributes for the int340x thermal processor
   (Srinivas Pandruvada)

 - Add SC7280 compatible for the tsens driver (Rajeshwari Ravindra
   Kamble)

 - Fix kernel documentation for thermal_zone_device_unregister() and use
   devm_platform_get_and_ioremap_resource() (Yang Yingliang)

 - Fix coefficient calculations for the rcar_gen3 sensor driver (Niklas
   Söderlund)

 - Fix shadowing variable rcar_gen3_ths_tj_1 (Geert Uytterhoeven)

 - Add missing of_node_put() for the iMX and Spreadtrum sensors
   (Krzysztof Kozlowski)

 - Add tegra3 thermal sensor DT bindings (Dmitry Osipenko)

 - Stop the thermal zone monitoring when unregistering it to prevent a
   temperature update without the 'get_temp' callback (Dmitry Osipenko)

 - Add rk3568 DT bindings, convert bindings to yaml schemas and add the
   corresponding compatible in the Rockchip sensor (Ezequiel Garcia)

 - Add the sc8180x compatible for the Qualcomm tsensor (Bjorn Andersson)

 - Use the find_first_zero_bit() function instead of custom code (Andy
   Shevchenko)

 - Fix the kernel doc for the device cooling device (Yang Li)

 - Reorg the processor thermal int340x to set the scene for the PCI mmio
   driver (Srinivas Pandruvada)

 - Add PCI MMIO driver for the int340x processor thermal driver
   (Srinivas Pandruvada)

 - Add hwmon sensors for the mediatek sensor (Frank Wunderlich)

 - Fix warning for return value reported by Smatch for the int340x
   thermal processor (Srinivas Pandruvada)

 - Fix wrong register access and decoding for the int340x thermal
   processor (Srinivas Pandruvada)

* tag 'thermal-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (23 commits)
  thermal/drivers/int340x/processor_thermal: Fix tcc setting
  thermal/drivers/int340x/processor_thermal: Fix warning for return value
  thermal/drivers/mediatek: Add sensors-support
  thermal/drivers/int340x/processor_thermal: Add PCI MMIO based thermal driver
  thermal/drivers/int340x/processor_thermal: Split enumeration and processing part
  thermal: devfreq_cooling: Fix kernel-doc
  thermal/drivers/intel/intel_soc_dts_iosf: Switch to use find_first_zero_bit()
  dt-bindings: thermal: tsens: Add sc8180x compatible
  dt-bindings: rockchip-thermal: Support the RK3568 SoC compatible
  dt-bindings: thermal: convert rockchip-thermal to json-schema
  thermal/core/thermal_of: Stop zone device before unregistering it
  dt-bindings: thermal: Add binding for Tegra30 thermal sensor
  thermal/drivers/sprd: Add missing of_node_put for loop iteration
  thermal/drivers/imx_sc: Add missing of_node_put for loop iteration
  thermal/drivers/rcar_gen3_thermal: Do not shadow rcar_gen3_ths_tj_1
  thermal/drivers/rcar_gen3_thermal: Fix coefficient calculations
  thermal/drivers/st: Use devm_platform_get_and_ioremap_resource()
  thermal/core: Correct function name thermal_zone_device_unregister()
  dt-bindings: thermal: tsens: Add compatible string to TSENS binding for SC7280
  thermal/drivers/int340x: processor_thermal: Export additional attributes
  ...
2021-07-10 11:43:25 -07:00
Matti Vaittinen db0aeb4f07
thermal: Use generic HW-protection shutdown API
The hardware shutdown function was exported from kernel/reboot for
other subsystems to use. Logic is copied from the thermal_core. The
protection mutex is replaced by an atomic_t to allow calls also from
an IRQ context. Also the WARN() was replaced by pr_emerg() based on
discussions here:
https://lore.kernel.org/lkml/YJuPwAZroVZ%2Fw633@alley/
and here:
https://lore.kernel.org/linux-iommu/20210331093104.383705-4-geert+renesas@glider.be/

Use the exported API instead of implementing own just for the
thermal_core.

Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/5531e89d9e710f5d10e7cdce3ee58957335b9e03.1622628333.git.matti.vaittinen@fi.rohmeurope.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-21 13:08:37 +01:00
Yang Yingliang a052b5118f thermal/core: Correct function name thermal_zone_device_unregister()
Fix the following make W=1 kernel build warning:

  drivers/thermal/thermal_core.c:1376: warning: expecting prototype for thermal_device_unregister(). Prototype was for thermal_zone_device_unregister() instead

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210517051020.3463536-1-yangyingliang@huawei.com
2021-06-12 21:10:06 +02:00
Thara Gopinath d60d6e7adf thermal/core: Remove thermal_notify_framework
thermal_notify_framework just updates for a single trip point where as
thermal_zone_device_update does other bookkeeping like updating the
temperature of the thermal zone and setting the next trip point. The only
driver that was using thermal_notify_framework was updated in the previous
patch to use thermal_zone_device_update instead. Since there are no users
for thermal_notify_framework remove it.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210122023406.3500424-3-thara.gopinath@linaro.org
2021-04-22 13:14:09 +02:00
Daniel Lezcano d44616c6cc thermal/core: Fix memory leak in the error path
Fix the following error:

 smatch warnings:
 drivers/thermal/thermal_core.c:1020 __thermal_cooling_device_register() warn: possible memory leak of 'cdev'

by freeing the cdev when exiting the function in the error path.

Fixes: 5848376181 ("thermal/drivers/core: Use a char pointer for the cooling device name")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210319202257.890848-1-daniel.lezcano@linaro.org
2021-04-15 13:21:00 +02:00
Daniel Lezcano 5848376181 thermal/drivers/core: Use a char pointer for the cooling device name
We want to have any kind of name for the cooling devices as we do no
longer want to rely on auto-numbering. Let's replace the cooling
device's fixed array by a char pointer to be allocated dynamically
when registering the cooling device, so we don't limit the length of
the name.

Rework the error path at the same time as we have to rollback the
allocations in case of error.

Tested with a dummy device having the name:
 "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch"

A village on the island of Anglesey (Wales), known to have the longest
name in Europe.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20210314111333.16551-1-daniel.lezcano@linaro.org
2021-03-15 04:46:25 +01:00
Daniel Lezcano d0df264fbd thermal/core: Remove pointless thermal_zone_device_reset() function
The function thermal_zone_device_reset() is called in the
thermal_zone_device_register() which allocates and initialize the
structure. The passive field is already zero-ed by the allocation, the
function is useless.

Call directly thermal_zone_device_init() instead and
thermal_zone_device_reset().

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201222181110.1231977-1-daniel.lezcano@linaro.org
2021-01-19 22:23:49 +01:00
Daniel Lezcano b39d2dd5b5 thermal/core: Remove ms based delay fields
The code does no longer use the ms unit based fields to set the
delays as they are replaced by the jiffies.

Remove them and replace their user to use the jiffies version instead.

Cc: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Peter Kästle <peter@piie.net>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20201216220337.839878-3-daniel.lezcano@linaro.org
2021-01-19 22:23:49 +01:00
Daniel Lezcano 39a38808d0 thermal/core: Use precomputed jiffies for the polling
The delays are also stored in jiffies based unit. Use them instead of
the ms.

Cc: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201216220337.839878-2-daniel.lezcano@linaro.org
2021-01-19 22:23:43 +01:00
Daniel Lezcano 17d399cd9c thermal/core: Precompute the delays from msecs to jiffies
The delays are stored in ms units and when the polling function is
called this delay is converted into jiffies at each call.

Instead of doing the conversion again and again, compute the jiffies
at init time and use the value directly when setting the polling.

Cc: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201216220337.839878-1-daniel.lezcano@linaro.org
2021-01-19 22:23:38 +01:00
Daniel Lezcano 716072d065 thermal/core: Remove THERMAL_TRIPS_NONE test
The last site calling the thermal_zone_bind_cooling_device() function
with the THERMAL_TRIPS_NONE parameter was removed.

We can get rid of this test as no user of this function is calling
this function with this parameter.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201214233811.485669-5-daniel.lezcano@linaro.org
2021-01-19 22:23:25 +01:00
Daniel Lezcano a20b995b23 thermal/core: Remove unused functions rebind/unbind exception
The functions thermal_zone_device_rebind_exception and
thermal_zone_device_unbind_exception are not used from anywhere.

Remove that code.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201214233811.485669-2-daniel.lezcano@linaro.org
2021-01-19 22:23:04 +01:00
Daniel Lezcano 04f111130e thermal/core: Remove notify ops
With the removal of the notifys user in a previous patches, the ops is no
longer needed, remove it.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20201210121514.25760-5-daniel.lezcano@linaro.org
2021-01-07 17:48:56 +01:00
Daniel Lezcano d7203eedf4 thermal/core: Add critical and hot ops
Currently there is no way to the sensors to directly call an ops in
interrupt mode without calling thermal_zone_device_update assuming all
the trip points are defined.

A sensor may want to do something special if a trip point is hot or
critical.

This patch adds the critical and hot ops to the thermal zone device,
so a sensor can directly invoke them or let the thermal framework to
call the sensor specific ones.

Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20201210121514.25760-2-daniel.lezcano@linaro.org
2020-12-11 14:11:13 +01:00
Daniel Lezcano 433178e758 thermal/core: Emit a warning if the thermal zone is updated without ops
The actual code is silently ignoring a thermal zone update when a
driver is requesting it without a get_temp ops set.

That looks not correct, as the caller should not have called this
function if the thermal zone is unable to read the temperature.

That makes the code less robust as the check won't detect the driver
is inconsistently using the thermal API and that does not help to
improve the framework as these circumvolutions hide the problem at the
source.

In order to detect the situation when it happens, let's add a warning
when the update is requested without the get_temp() ops set.

Any warning emitted will have to be fixed at the source of the
problem: the caller must not call thermal_zone_device_update if there
is not get_temp callback set.

Cc: Thara Gopinath <thara.gopinath@linaro.org>
Cc: Amit Kucheria <amitk@kernel.org>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20201210121514.25760-1-daniel.lezcano@linaro.org
2020-12-11 14:11:13 +01:00
Bernard Zhao 37b2539e63 drivers/thermal/core: Optimize trip points check
The trip points are checked one by one with multiple condition
branches where one condition is enough to disable the trip point.

Merge all these conditions in a single 'OR' statement.

Signed-off-by: Bernard Zhao <bernard@vivo.com>
Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201027013743.62392-1-bernard@vivo.com

[dlezcano] Changed patch description
2020-10-27 09:53:19 +01:00
Lukasz Luba 345a8af7ea thermal: core: Move power_actor_set_power into IPA
Since the power actor section has one function power_actor_set_power()
move it into Intelligent Power Allocation (IPA). There is no other user
of that helper function. It would also allow to remove the check of
cdev_is_power_actor() because the code which calls it in IPA already does
the needed check. Make the function static since only IPA use it.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201015112441.4056-5-lukasz.luba@arm.com
2020-10-27 09:44:32 +01:00
Lukasz Luba 87d2380260 thermal: core: Remove unused functions in power actor section
Since the Intelligent Power Allocation (IPA) uses different way to get
minimum and maximum power for a given cooling device, the helper functions
are not needed. There is no other code which uses them, so remove the
helper functions.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201015112441.4056-4-lukasz.luba@arm.com
2020-10-27 09:44:32 +01:00
Michael Kao 4ab17ed131 thermal: core: Add upper and lower limits to power_actor_set_power
The upper and lower limits of thermal throttle state in the
DT do not apply to the Intelligent Power Allocation (IPA) governor.
Add the clamping for cooling device upper and lower limits in the
power_actor_set_power() used by IPA.

Signed-off-by: Michael Kao <michael.kao@mediatek.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201007024332.30322-1-michael.kao@mediatek.com
2020-10-26 19:46:35 +01:00
zhuguangqing ecd1d2a3e4 thermal: cooling: Remove unused variable *tz
1. devfreq_cooling.c: The variable *tz is not used in
devfreq_cooling_get_requested_power(), devfreq_cooling_state2power()
and devfreq_cooling_power2state().

2. cpufreq_cooling.c: After 84fe2cab48, the variable *tz is not used
anymore in cpufreq_get_requested_power(), cpufreq_state2power() and
cpufreq_power2state().

Remove the variable *tz.

Signed-off-by: zhuguangqing <zhuguangqing@xiaomi.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200914071101.13575-1-zhuguangqing83@gmail.com
2020-10-12 12:08:36 +02:00
Qinglang Miao df3e647d68 thermal: core: remove unnecessary mutex_init()
The mutex poweroff_lock is initialized statically. It is
unnecessary to initialize by mutex_init().

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200916062139.191233-1-miaoqinglang@huawei.com
2020-10-12 12:08:35 +02:00
Dmitry Osipenko a5f785ce60 thermal: core: Fix use-after-free in thermal_zone_device_unregister()
The user-after-free bug in thermal_zone_device_unregister() is reported by
KASAN. It happens because struct thermal_zone_device is released during of
device_unregister() invocation, and hence the "tz" variable shouldn't be
touched by thermal_notify_tz_delete(tz->id).

Fixes: 55cdf0a283 ("thermal: core: Add notifications call in the framework")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200817235854.26816-1-digetx@gmail.com
2020-09-04 11:52:54 +02:00
Daniel Lezcano 25be77e588 thermal: core: Add thermal zone enable/disable notification
Now the calls to enable/disable a thermal zone are centralized in a
call to a function, we can add in these the corresponding netlink
notifications.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Link: https://lore.kernel.org/r/20200727231033.26512-1-daniel.lezcano@linaro.org
2020-07-29 10:21:48 +02:00
Thierry Reding 82aa68afa1 thermal: core: Fix thermal zone lookup by ID
When a thermal zone is looked up by an ID and no zone is found matching
that ID, the thermal_zone_get_by_id() function will return a pointer to
the thermal zone list head which isn't actually a valid thermal zone.

This can lead to a subsequent crash because a valid pointer is returned
to the called, but dereferencing that pointer as struct thermal_zone is
not safe.

Fixes: 329b064fbd ("thermal: core: Get thermal zone by id")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200724170105.2705467-1-thierry.reding@gmail.com
2020-07-24 19:11:47 +02:00
Daniel Lezcano 3f5a2cbe0f thermal: core: Move initialization after core initcall
The generic netlink is initialized at subsys_initcall, so far after
the thermal init routine and the thermal generic netlink family
initialization.

On ŝome platforms, that leads to a memory corruption.

The fix was sent to netdev@ to move the genetlink framework
initialization at core_initcall.

Move the thermal core initialization to postcore level which is very
close to core level.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Link: https://lore.kernel.org/r/20200717164217.18819-2-daniel.lezcano@linaro.org
2020-07-21 10:40:08 +02:00
Daniel Lezcano d2a89b5283 thermal: netlink: Improve the initcall ordering
The initcalls like to play joke. In our case, the thermal-netlink
initcall is called after the thermal-core initcall but this one sends
a notification before the former is initialized. No issue was spotted,
but it could lead to a memory corruption, so instead of relying on the
core_initcall for the thermal-netlink, let's initialize directly from
the thermal-core init routine, so we have full control of the init
ordering.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Link: https://lore.kernel.org/r/20200717164217.18819-1-daniel.lezcano@linaro.org
2020-07-21 10:40:08 +02:00
Daniel Lezcano 55cdf0a283 thermal: core: Add notifications call in the framework
The generic netlink protocol is implemented but the different
notification functions are not yet connected to the core code.

These changes add the notification calls in the different
corresponding places.

Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20200706105538.2159-4-daniel.lezcano@linaro.org
2020-07-07 15:55:22 +02:00
Daniel Lezcano 329b064fbd thermal: core: Get thermal zone by id
The next patch will introduce the generic netlink protocol to handle
events, sampling and command from the thermal framework. In order to
deal with the thermal zone, it uses its unique identifier to
characterize it in the message. Passing an integer is more efficient
than passing an entire string.

This change provides a function returning back a thermal zone pointer
corresponding to the identifier passed as parameter.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20200706105538.2159-2-daniel.lezcano@linaro.org
2020-07-07 15:55:21 +02:00
Daniel Lezcano 3d44a509c1 thermal: core: Add helpers to browse the cdev, tz and governor list
The cdev, tz and governor list, as well as their respective locks are
statically defined in the thermal_core.c file.

In order to give a sane access to these list, like browsing all the
thermal zones or all the cooling devices, let's define a set of
helpers where we pass a callback as a parameter to be called for each
thermal entity.

We keep the self-encapsulation and ensure the locks are correctly
taken when looking at the list.

Acked-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200706105538.2159-1-daniel.lezcano@linaro.org
2020-07-07 15:55:20 +02:00
Andrzej Pietrasiewicz 514acd00f9 thermal: Make thermal_zone_device_is_enabled() available to core only
This function is not needed by drivers.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200703104354.19657-4-andrzej.p@collabora.com
2020-07-07 01:26:07 +02:00
Andrzej Pietrasiewicz f5e50bf4d3 thermal: Rename set_mode() to change_mode()
set_mode() is only called when tzd's mode is about to change. Actual
setting is performed in thermal_core, in thermal_zone_device_set_mode().
The meaning of set_mode() callback is actually to notify the driver about
the mode being changed and giving the driver a chance to oppose such
change.

To better reflect the purpose of the method rename it to change_mode()

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-12-andrzej.p@collabora.com
2020-06-29 20:26:39 +02:00
Andrzej Pietrasiewicz b56bdff78e thermal: core: Stop polling DISABLED thermal devices
Polling DISABLED devices is not desired, as all such "disabled" devices
are meant to be handled by userspace. This patch introduces and uses
should_stop_polling() to decide whether the device should be polled or not.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-10-andrzej.p@collabora.com
2020-06-29 20:26:38 +02:00
Andrzej Pietrasiewicz 7f4957be0d thermal: Use mode helpers in drivers
Use thermal_zone_device_{en|dis}able() and thermal_zone_device_is_enabled().

Consequently, all set_mode() implementations in drivers:

- can stop modifying tzd's "mode" member,
- shall stop taking tzd's lock, as it is taken in the helpers
- shall stop calling thermal_zone_device_update() as it is called in the
helpers
- can assume they are called when the mode truly changes, so checks to
verify that can be dropped

Not providing set_mode() by a driver no longer prevents the core from
being able to set tzd's mode, so the relevant check in mode_store() is
removed.

Other comments:

- acpi/thermal.c: tz->thermal_zone->mode will be updated only after we
return from set_mode(), so use function parameter in thermal_set_mode()
instead, no need to call acpi_thermal_check() in set_mode()
- thermal/imx_thermal.c: regmap writes and mode assignment are done in
thermal_zone_device_{en|dis}able() and set_mode() callback
- thermal/intel/intel_quark_dts_thermal.c: soc_dts_{en|dis}able() are a
part of set_mode() callback, so they don't need to modify tzd->mode, and
don't need to fall back to the opposite mode if unsuccessful, as the return
value will be propagated to thermal_zone_device_{en|dis}able() and
ultimately tzd's member will not be changed in thermal_zone_device_set_mode().
- thermal/of-thermal.c: no need to set zone->mode to DISABLED in
of_parse_thermal_zones() as a tzd is kzalloc'ed so mode is DISABLED anyway

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-8-andrzej.p@collabora.com
2020-06-29 20:26:36 +02:00
Andrzej Pietrasiewicz ac5d9ecc74 thermal: Add mode helpers
Prepare for making the drivers not access tzd's private members.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
[staticize thermal_zone_device_set_mode()]
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-7-andrzej.p@collabora.com
2020-06-29 20:26:36 +02:00
Andrzej Pietrasiewicz 1ee14820fd thermal: remove get_mode() operation of drivers
get_mode() is now redundant, as the state is stored in struct
thermal_zone_device.

Consequently the "mode" attribute in sysfs can always be visible, because
it is always possible to get the mode from struct tzd.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-6-andrzej.p@collabora.com
2020-06-29 20:26:35 +02:00
Amit Kucheria 3f0cfea3dd thermal/core: Replace module.h with export.h
Thermal core cannot be modular, remove the unnecessary module.h include
and replace with export.h to handle EXPORT_SYMBOL family of macros.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/33af23406dcdb0c62dae1e6401446b997ccb449f.1589199124.git.amit.kucheria@linaro.org
2020-05-22 18:48:53 +02:00
Amit Kucheria 869495ccf5 thermal/core: Get rid of MODULE_* tags
The thermal framework can no longer be compiled as a module as of
commit 554b3529fe ("thermal/drivers/core: Remove the module Kconfig's
option"). Remove the MODULE_* tags.

Rui is mentioned in the copyright line at the top of the file and the
license is mentioned in the SPDX tags. So no loss of information.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/74339a09a55f8f3d86c4074fc2bf853a302d6186.1589199124.git.amit.kucheria@linaro.org
2020-05-22 18:48:53 +02:00
Daniel Lezcano 44fc73223e thermal: core: Remove pointless debug traces
The last temperature and the current temperature are show via a
dev_debug. The line before, those temperature are also traced.

It is pointless to duplicate the traces for the temperatures,
remove the dev_dbg traces.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Link: https://lore.kernel.org/r/20200331165449.30355-2-daniel.lezcano@linaro.org
2020-04-14 11:41:12 +02:00
Wei Wang 163b00cde7 thermal: Fix deadlock in thermal thermal_zone_device_check
1851799e1d ("thermal: Fix use-after-free when unregistering thermal zone
device") changed cancel_delayed_work to cancel_delayed_work_sync to avoid
a use-after-free issue. However, cancel_delayed_work_sync could be called
insides the WQ causing deadlock.

[54109.642398] c0   1162 kworker/u17:1   D    0 11030      2 0x00000000
[54109.642437] c0   1162 Workqueue: thermal_passive_wq thermal_zone_device_check
[54109.642447] c0   1162 Call trace:
[54109.642456] c0   1162  __switch_to+0x138/0x158
[54109.642467] c0   1162  __schedule+0xba4/0x1434
[54109.642480] c0   1162  schedule_timeout+0xa0/0xb28
[54109.642492] c0   1162  wait_for_common+0x138/0x2e8
[54109.642511] c0   1162  flush_work+0x348/0x40c
[54109.642522] c0   1162  __cancel_work_timer+0x180/0x218
[54109.642544] c0   1162  handle_thermal_trip+0x2c4/0x5a4
[54109.642553] c0   1162  thermal_zone_device_update+0x1b4/0x25c
[54109.642563] c0   1162  thermal_zone_device_check+0x18/0x24
[54109.642574] c0   1162  process_one_work+0x3cc/0x69c
[54109.642583] c0   1162  worker_thread+0x49c/0x7c0
[54109.642593] c0   1162  kthread+0x17c/0x1b0
[54109.642602] c0   1162  ret_from_fork+0x10/0x18
[54109.643051] c0   1162 kworker/u17:2   D    0 16245      2 0x00000000
[54109.643067] c0   1162 Workqueue: thermal_passive_wq thermal_zone_device_check
[54109.643077] c0   1162 Call trace:
[54109.643085] c0   1162  __switch_to+0x138/0x158
[54109.643095] c0   1162  __schedule+0xba4/0x1434
[54109.643104] c0   1162  schedule_timeout+0xa0/0xb28
[54109.643114] c0   1162  wait_for_common+0x138/0x2e8
[54109.643122] c0   1162  flush_work+0x348/0x40c
[54109.643131] c0   1162  __cancel_work_timer+0x180/0x218
[54109.643141] c0   1162  handle_thermal_trip+0x2c4/0x5a4
[54109.643150] c0   1162  thermal_zone_device_update+0x1b4/0x25c
[54109.643159] c0   1162  thermal_zone_device_check+0x18/0x24
[54109.643167] c0   1162  process_one_work+0x3cc/0x69c
[54109.643177] c0   1162  worker_thread+0x49c/0x7c0
[54109.643186] c0   1162  kthread+0x17c/0x1b0
[54109.643195] c0   1162  ret_from_fork+0x10/0x18
[54109.644500] c0   1162 cat             D    0  7766      1 0x00000001
[54109.644515] c0   1162 Call trace:
[54109.644524] c0   1162  __switch_to+0x138/0x158
[54109.644536] c0   1162  __schedule+0xba4/0x1434
[54109.644546] c0   1162  schedule_preempt_disabled+0x80/0xb0
[54109.644555] c0   1162  __mutex_lock+0x3a8/0x7f0
[54109.644563] c0   1162  __mutex_lock_slowpath+0x14/0x20
[54109.644575] c0   1162  thermal_zone_get_temp+0x84/0x360
[54109.644586] c0   1162  temp_show+0x30/0x78
[54109.644609] c0   1162  dev_attr_show+0x5c/0xf0
[54109.644628] c0   1162  sysfs_kf_seq_show+0xcc/0x1a4
[54109.644636] c0   1162  kernfs_seq_show+0x48/0x88
[54109.644656] c0   1162  seq_read+0x1f4/0x73c
[54109.644664] c0   1162  kernfs_fop_read+0x84/0x318
[54109.644683] c0   1162  __vfs_read+0x50/0x1bc
[54109.644692] c0   1162  vfs_read+0xa4/0x140
[54109.644701] c0   1162  SyS_read+0xbc/0x144
[54109.644708] c0   1162  el0_svc_naked+0x34/0x38
[54109.845800] c0   1162 D 720.000s 1->7766->7766 cat [panic]

Fixes: 1851799e1d ("thermal: Fix use-after-free when unregistering thermal zone device")
Cc: stable@vger.kernel.org
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-11-14 22:41:09 +08:00
Amit Kucheria ae16a688f6 thermal: Initialize thermal subsystem earlier
Now that the thermal framework is built-in, in order to facilitate
thermal mitigation as early as possible in the boot cycle, move the
thermal framework initialization to core_initcall.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/f8ff0ab4a8e9c2eca5a26fb2256365b26cb326ce.1571656015.git.amit.kucheria@linaro.org
2019-11-07 07:00:26 +01:00
Amit Kucheria f96c8e5015 thermal: Remove netlink support
There are no users of netlink messages for thermal inside the kernel.
Remove the code and adjust the documentation.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/8ff02cf62186c7a54fff325fad40a2e9ca3affa6.1571656014.git.amit.kucheria@linaro.org
2019-11-07 07:00:26 +01:00
Amit Kucheria 67eed44b8a thermal: Add some error messages
When registering a thermal zone device, we currently return -EINVAL in
four cases. This makes it a little hard to debug the real cause of the
failure.

Print some error messages to make it easier for developer to figure out
what happened.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-09-24 09:56:08 +08:00
Ido Schimmel 1851799e1d thermal: Fix use-after-free when unregistering thermal zone device
thermal_zone_device_unregister() cancels the delayed work that polls the
thermal zone, but it does not wait for it to finish. This is racy with
respect to the freeing of the thermal zone device, which can result in a
use-after-free [1].

Fix this by waiting for the delayed work to finish before freeing the
thermal zone device. Note that thermal_zone_device_set_polling() is
never invoked from an atomic context, so it is safe to call
cancel_delayed_work_sync() that can block.

[1]
[  +0.002221] ==================================================================
[  +0.000064] BUG: KASAN: use-after-free in __mutex_lock+0x1076/0x11c0
[  +0.000016] Read of size 8 at addr ffff8881e48e0450 by task kworker/1:0/17

[  +0.000023] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.2.0-rc6-custom-02495-g8e73ca3be4af #1701
[  +0.000010] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[  +0.000016] Workqueue: events_freezable_power_ thermal_zone_device_check
[  +0.000012] Call Trace:
[  +0.000021]  dump_stack+0xa9/0x10e
[  +0.000020]  print_address_description.cold.2+0x9/0x25e
[  +0.000018]  __kasan_report.cold.3+0x78/0x9d
[  +0.000016]  kasan_report+0xe/0x20
[  +0.000016]  __mutex_lock+0x1076/0x11c0
[  +0.000014]  step_wise_throttle+0x72/0x150
[  +0.000018]  handle_thermal_trip+0x167/0x760
[  +0.000019]  thermal_zone_device_update+0x19e/0x5f0
[  +0.000019]  process_one_work+0x969/0x16f0
[  +0.000017]  worker_thread+0x91/0xc40
[  +0.000014]  kthread+0x33d/0x400
[  +0.000015]  ret_from_fork+0x3a/0x50

[  +0.000020] Allocated by task 1:
[  +0.000015]  save_stack+0x19/0x80
[  +0.000015]  __kasan_kmalloc.constprop.4+0xc1/0xd0
[  +0.000014]  kmem_cache_alloc_trace+0x152/0x320
[  +0.000015]  thermal_zone_device_register+0x1b4/0x13a0
[  +0.000015]  mlxsw_thermal_init+0xc92/0x23d0
[  +0.000014]  __mlxsw_core_bus_device_register+0x659/0x11b0
[  +0.000013]  mlxsw_core_bus_device_register+0x3d/0x90
[  +0.000013]  mlxsw_pci_probe+0x355/0x4b0
[  +0.000014]  local_pci_probe+0xc3/0x150
[  +0.000013]  pci_device_probe+0x280/0x410
[  +0.000013]  really_probe+0x26a/0xbb0
[  +0.000013]  driver_probe_device+0x208/0x2e0
[  +0.000013]  device_driver_attach+0xfe/0x140
[  +0.000013]  __driver_attach+0x110/0x310
[  +0.000013]  bus_for_each_dev+0x14b/0x1d0
[  +0.000013]  driver_register+0x1c0/0x400
[  +0.000015]  mlxsw_sp_module_init+0x5d/0xd3
[  +0.000014]  do_one_initcall+0x239/0x4dd
[  +0.000013]  kernel_init_freeable+0x42b/0x4e8
[  +0.000012]  kernel_init+0x11/0x18b
[  +0.000013]  ret_from_fork+0x3a/0x50

[  +0.000015] Freed by task 581:
[  +0.000013]  save_stack+0x19/0x80
[  +0.000014]  __kasan_slab_free+0x125/0x170
[  +0.000013]  kfree+0xf3/0x310
[  +0.000013]  thermal_release+0xc7/0xf0
[  +0.000014]  device_release+0x77/0x200
[  +0.000014]  kobject_put+0x1a8/0x4c0
[  +0.000014]  device_unregister+0x38/0xc0
[  +0.000014]  thermal_zone_device_unregister+0x54e/0x6a0
[  +0.000014]  mlxsw_thermal_fini+0x184/0x35a
[  +0.000014]  mlxsw_core_bus_device_unregister+0x10a/0x640
[  +0.000013]  mlxsw_devlink_core_bus_device_reload+0x92/0x210
[  +0.000015]  devlink_nl_cmd_reload+0x113/0x1f0
[  +0.000014]  genl_family_rcv_msg+0x700/0xee0
[  +0.000013]  genl_rcv_msg+0xca/0x170
[  +0.000013]  netlink_rcv_skb+0x137/0x3a0
[  +0.000012]  genl_rcv+0x29/0x40
[  +0.000013]  netlink_unicast+0x49b/0x660
[  +0.000013]  netlink_sendmsg+0x755/0xc90
[  +0.000013]  __sys_sendto+0x3de/0x430
[  +0.000013]  __x64_sys_sendto+0xe2/0x1b0
[  +0.000013]  do_syscall_64+0xa4/0x4d0
[  +0.000013]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  +0.000017] The buggy address belongs to the object at ffff8881e48e0008
               which belongs to the cache kmalloc-2k of size 2048
[  +0.000012] The buggy address is located 1096 bytes inside of
               2048-byte region [ffff8881e48e0008, ffff8881e48e0808)
[  +0.000007] The buggy address belongs to the page:
[  +0.000012] page:ffffea0007923800 refcount:1 mapcount:0 mapping:ffff88823680d0c0 index:0x0 compound_mapcount: 0
[  +0.000020] flags: 0x200000000010200(slab|head)
[  +0.000019] raw: 0200000000010200 ffffea0007682008 ffffea00076ab808 ffff88823680d0c0
[  +0.000016] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
[  +0.000007] page dumped because: kasan: bad access detected

[  +0.000012] Memory state around the buggy address:
[  +0.000012]  ffff8881e48e0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012]  ffff8881e48e0380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012] >ffff8881e48e0400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000008]                                                  ^
[  +0.000012]  ffff8881e48e0480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012]  ffff8881e48e0500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000007] ==================================================================

Fixes: b1569e99c7 ("ACPI: move thermal trip handling to generic thermal layer")
Reported-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-09-24 09:56:08 +08:00
Yue Hu adc8749b15 thermal/drivers/core: Use put_device() if device_register() fails
Never directly free @dev after calling device_register(), even if it
returned an error! Always use put_device() to give up the reference
initialized. Clean up the rollback block also.

Signed-off-by: Yue Hu <huyue2@yulong.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-09-24 09:56:08 +08:00
Daniel Lezcano 57c5b2ec90 thermal/drivers/core: Use governor table to initialize
Now that the governor table is in place and the macro allows to browse the
table, declare the governor so the entry is added in the governor table
in the init section.

The [un]register_thermal_governors function does no longer need to use the
exported [un]register thermal governor's specific function which in turn
call the [un]register_thermal_governor. The governors are fully
self-encapsulated.

The cyclic dependency is no longer needed, remove it.

Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-06-27 21:22:14 +08:00
Linus Torvalds 2c45e7fbc9 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:

 - Remove the 'module' Kconfig option for thermal subsystem framework
   because the thermal framework are required to be ready as early as
   possible to avoid overheat at boot time (Daniel Lezcano)

 - Fix a bug that thermal framework pokes disabled thermal zones upon
   resume (Wei Wang)

  - A couple of cleanups and trivial fixes on int340x thermal drivers
    (Srinivas Pandruvada, Zhang Rui, Sumeet Pawnikar)

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  drivers: thermal: processor_thermal: Downgrade error message
  mlxsw: Remove obsolete dependency on THERMAL=m
  hwmon/drivers/core: Simplify complex dependency
  thermal/drivers/core: Fix typo in the option name
  thermal/drivers/core: Remove depends on THERMAL in Kconfig
  thermal/drivers/core: Remove module unload code
  thermal/drivers/core: Remove the module Kconfig's option
  thermal: core: skip update disabled thermal zones after suspend
  thermal: make device_register's type argument const
  thermal: intel: int340x: processor_thermal_device: simplify to get driver data
  thermal/int3403_thermal: favor _TMP instead of PTYP
2019-05-16 16:16:18 -07:00
Guenter Roeck b4ab114cc6 thermal: Introduce devm_thermal_of_cooling_device_register
thermal_of_cooling_device_register() and thermal_cooling_device_register()
are typically called from driver probe functions, and
thermal_cooling_device_unregister() is called from remove functions. This
makes both a perfect candidate for device managed functions.

Introduce devm_thermal_of_cooling_device_register(). This function can
also be used to replace thermal_cooling_device_register() by passing a NULL
pointer as device node. The new function requires both struct device *
and struct device_node * as parameters since the struct device_node *
parameter is not always identical to dev->of_node.

Don't introduce a device managed remove function since it is not needed
at this point.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-14 07:00:30 -07:00
Zhang Rui 6df24c3e81 Merge branches 'thermal-core', 'thermal-built-it' and 'thermal-intel' into next 2019-05-07 21:54:11 +08:00
Daniel Lezcano 77e1dd46a1 thermal/drivers/core: Remove module unload code
Now the thermal core is no longer compiled as a module. Remove the
unloading module code and move the unregister function to the __init
section.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-05-06 20:35:24 +08:00
Wei Wang ff54bbd1be thermal: core: skip update disabled thermal zones after suspend
It is unnecessary to update disabled thermal zones post suspend and
sometimes leads error/warning in bad behaved thermal drivers.

Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-05-06 20:35:11 +08:00
Jean-Francois Dagenais f991de53a8 thermal: make device_register's type argument const
...because it can be, the buffer is strlcpy'd into a local buffer in a
thermal struct member.

Signed-off-by: Jean-Francois Dagenais <jeff.dagenais@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-05-06 20:35:11 +08:00
Wei Wang 964f4843a4 Thermal: do not clear passive state during system sleep
commit ff140fea84 ("Thermal: handle thermal zone device properly
during system sleep") added PM hook to call thermal zone reset during
sleep. However resetting thermal zone will also clear the passive state
and thus cancel the polling queue which leads the passive cooling device
state not being cleared properly after sleep.

thermal_pm_notify => thermal_zone_device_reset set passive to 0
thermal_zone_trip_update will skip update passive as `old_target ==
instance->target'.
monitor_thermal_zone => thermal_zone_device_set_polling will cancel
tz->poll_queue, so the cooling device state will not be changed
afterwards.

Reported-by: Kame Wang <kamewang@google.com>
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-11-30 16:53:13 +08:00
Lukasz Luba 5be52fccaf thermal: remove unused function parameter
Clean unused parameter from internal framework function.

Signed-off-by: Lukasz Luba <l.luba@partner.samsung.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-11-30 16:44:34 +08:00
Jeson Gao c2b59d279d thermal: core: using power_efficient_wq for thermal worker
For SMP systems, thermal worker should use power_efficient_wq in power
saving mode, that will make scheduler more flexible on selecting an active
core for running work handler to avoid keeping work handler always
running on a single core, that will save some power.

Even if 'power_efficient_wq' relevant configs are disabled
'system_freezable_power_efficient_wq' is identical to system_freezable_wq,
behavior is unchanged.

Signed-off-by: Jeson Gao <jeson.gao@unisoc.com>
Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-10-10 21:48:50 +08:00
Dmitry Osipenko 3c58776827 thermal: core: Fix use-after-free in thermal_cooling_device_destroy_sysfs
This patch fixes use-after-free that was detected by KASAN. The bug is
triggered on a CPUFreq driver module unload by freeing 'cdev' on device
unregister and then using the freed structure during of the cdev's sysfs
data destruction. The solution is to unregister the sysfs at first, then
destroy sysfs data and finally release the cooling device.

Cc: <stable@vger.kernel.org> # v4.17+
Fixes: 8ea229511e ("thermal: Add cooling device's statistics in sysfs")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-10-10 11:44:52 +08:00
Lina Iyer 7e3c03817f drivers: thermal: Update license to SPDX format
Update licences format for core thermal files.

Signed-off-by: Lina Iyer <ilina@codeaurora.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-05-30 14:46:17 +08:00
Viresh Kumar 33e678d47d thermal: Shorten name of sysfs callbacks
The naming isn't consistent across all sysfs callbacks in the thermal
core, some have a short name like type_show() and others have long names
like thermal_cooling_device_weight_show(). This patch tries to make it
consistent by shortening the name of sysfs callbacks.

Some of the sysfs files are named similarly for both thermal zone and
cooling device (like: type) and to avoid name clash between their
show/store routines, the cooling device specific sysfs callbacks are
prefixed with "cdev_".

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-05-22 10:07:09 +08:00
Viresh Kumar 8ea229511e thermal: Add cooling device's statistics in sysfs
This extends the sysfs interface for thermal cooling devices and exposes
some pretty useful statistics. These statistics have proven to be quite
useful specially while doing benchmarks related to the task scheduler,
where we want to make sure that nothing has disrupted the test,
specially the cooling device which may have put constraints on the CPUs.
The information exposed here tells us to what extent the CPUs were
constrained by the thermal framework.

The write-only "reset" file is used to reset the statistics.

The read-only "time_in_state_ms" file shows the time (in msec) spent by the
device in the respective cooling states, and it prints one line per
cooling state.

The read-only "total_trans" file shows single positive integer value
showing the total number of cooling state transitions the device has
gone through since the time the cooling device is registered or the time
when statistics were reset last.

The read-only "trans_table" file shows a two dimensional matrix, where
an entry <i,j> (row i, column j) represents the number of transitions
from State_i to State_j.

This is how the directory structure looks like for a single cooling
device:

$ ls -R /sys/class/thermal/cooling_device0/
/sys/class/thermal/cooling_device0/:
cur_state  max_state  power  stats  subsystem  type  uevent

/sys/class/thermal/cooling_device0/power:
autosuspend_delay_ms  runtime_active_time  runtime_suspended_time
control               runtime_status

/sys/class/thermal/cooling_device0/stats:
reset  time_in_state_ms  total_trans  trans_table

This is tested on ARM 64-bit Hisilicon hikey620 board running Ubuntu and
ARM 64-bit Hisilicon hikey960 board running Android.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-04-02 21:49:01 +08:00
Christophe Jaillet 9d9ca1f9f0 thermal: core: Fix resources release in error paths in thermal_zone_device_register()
Reorder error handling code in order to fix some resources leaks in some
cases:
   - 'tz' would leak if 'thermal_zone_create_device_groups()' fails
   - memory allocated by 'thermal_zone_create_device_groups()' would leak
     if 'device_register()' fails

With this patch, we now have 2 error handling paths: one before
'device_register()', and one after it.
This is needed because some resources are released in 'thermal_release()'.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-08-11 11:34:07 +08:00
Christophe Jaillet 6a6cd25b58 thermal: core: Use the new 'thermal_zone_destroy_device_groups()' helper function
Simplify code by using the new 'thermal_zone_destroy_device_groups()'
helper function.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-08-11 11:34:00 +08:00
Icenowy Zheng 039f6cf5b5 thermal: core: fix some format issues on critical shutdown string
The critical shutdown notice string used to have some spaces missing,
which makes it not so pretty.

Add the spaces to satisfy usual English space rules.

Reported-by: Mingcong Bai <jeffbai@aosc.io>
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-08-08 17:18:26 +08:00
Colin Ian King c4b379d064 thermal: core: make thermal_emergency_poweroff static
Making thermal_emergency_poweroff static fixes sparse warning:

  drivers/thermal/thermal_core.c:6: warning: symbol
  'thermal_emergency_poweroff' was not declared. Should it be static?

Fixes: ef1d87e06a ("thermal: core: Add a back up thermal shutdown mechanism")
Acked-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2017-05-23 20:03:35 -07:00
Keerthy ef1d87e06a thermal: core: Add a back up thermal shutdown mechanism
orderly_poweroff is triggered when a graceful shutdown
of system is desired. This may be used in many critical states of the
kernel such as when subsystems detects conditions such as critical
temperature conditions. However, in certain conditions in system
boot up sequences like those in the middle of driver probes being
initiated, userspace will be unable to power off the system in a clean
manner and leaves the system in a critical state. In cases like these,
the /sbin/poweroff will return success (having forked off to attempt
powering off the system. However, the system overall will fail to
completely poweroff (since other modules will be probed) and the system
is still functional with no userspace (since that would have shut itself
off).

However, there is no clean way of detecting such failure of userspace
powering off the system. In such scenarios, it is necessary for a backup
workqueue to be able to force a shutdown of the system when orderly
shutdown is not successful after a configurable time period.

Reported-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-05-05 16:01:45 +08:00
Keerthy e441fd6866 thermal: core: Allow orderly_poweroff to be called only once
thermal_zone_device_check --> thermal_zone_device_update -->
handle_thermal_trip --> handle_critical_trips --> orderly_poweroff

The above sequence happens every 250/500 mS based on the configuration.
The orderly_poweroff function is getting called every 250/500 mS.
With a full fledged file system it takes at least 5-10 Seconds to
power off gracefully.

In that period due to the thermal_zone_device_check triggering
periodically the thermal work queues bombard with
orderly_poweroff calls multiple times eventually leading to
failures in gracefully powering off the system.

Make sure that orderly_poweroff is called only once.

Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-05-05 16:01:44 +08:00
Zhang Rui 6fefe19f58 Merge branches 'thermal-core', 'thermal-soc', 'thermal-intel' and 'ida-conversion' into next 2017-02-22 15:35:06 +08:00
Jacob von Chorus f53345e8cf thermal: core: move tz->device.groups cleanup to thermal_release
The device_unregister call in thermal_zone_device_unregister causes the
thermal_zone_device structure to be freed before the call to free the
dynamically allocated attribute groups. This leads to a kernel panic.

Furthermore, the 4 calls to free the trip point attribute structures
occur before the call to unregister the device, leading to a kernel
panic when sysfs attempts to access the attributes to remove them.

Here is an example of a kernel panic when the cpu thermal zones are
removed upon cpu offline:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: strlen+0x0/0x20
<snip>
Call Trace:
   ? kernfs_name_hash+0x17/0x80
   kernfs_find_ns+0x3f/0xd0
   kernfs_remove_by_name_ns+0x36/0xa0
   remove_files.isra.1+0x36/0x70
   sysfs_remove_group+0x44/0x90
   sysfs_remove_groups+0x2e/0x50
   device_remove_attrs+0x5e/0x90
   device_del+0x1ea/0x350
   device_unregister+0x1a/0x60
   thermal_zone_device_unregister+0x1f2/0x210
   pkg_thermal_cpu_offline+0x14f/0x1a0 [x86_pkg_temp_thermal]
   ? kzalloc.constprop.2+0x10/0x10 [x86_pkg_temp_thermal]
   cpuhp_invoke_callback+0x8d/0x3f0
   cpuhp_down_callbacks+0x42/0x80
   cpuhp_thread_fun+0x8b/0xf0
   smpboot_thread_fn+0x110/0x160
   kthread+0x101/0x140
   ? sort_range+0x30/0x30
   ? kthread_park+0x90/0x90
   ret_from_fork+0x25/0x30

This patch moves the kfree calls to clean up the dynamic attributes to
the thermal_class's thermal_zone_device release function.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Jacob von Chorus <jacobvonchorus@cwphoto.ca>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-01-06 13:04:00 +08:00
Matthew Wilcox b31ef8285b thermal core: convert ID allocation to IDA
The thermal core does not use the ability to look up pointers by ID, so
convert it from using an IDR to the more space-efficient IDA.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-01-04 12:47:28 +08:00
Linus Torvalds 9346116d14 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:

 - Thermal core code reorganization and cleanup. Two new files are
   created for thermal sysfs I/F code and thermal helper functions
   (Eduardo Valentin).

 - Sanitize hotplug and locking for x86_pkg_temp driver (Thomas
   Gleixner)

 - Update MAINTAINER file for pwm-fan driver and Samsung thermal driver
   (Lukasz Majewski)

 - Fix module auto-load for max77620, tango and db8500 thermal driver
   (Javier Martinez Canillas)

 - Fix a bug that thermal hwmon sysfs I/F returns wrong critical trip
   point temperature value (Krzysztof Kozlowski)

 - Add Skylake PCH 100 series support for intel_pch_thermal driver
   (OGAWA Hirofumi)

 - Small fixes and cleanups for platform thermal drivers (Julia Lawall,
   Luis Henriques, Leo Yan, Stephen Boyd, Shawn Lin, Javi Merino and
   Lukasz Luba)

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (76 commits)
  MAINTAINERS: Samsung: Update maintainer for PWM FAN and SAMSUNG THERMAL
  thermal/x86 pkg temp: Convert to hotplug state machine
  thermal/x86_pkg_temp: Sanitize package management
  thermal/x86_pkg_temp: Move work into package struct
  thermal/x86_pkg_temp: Move work scheduled flag into package struct
  thermal/x86_pkg_temp: Sanitize locking
  thermal/x86_pkg_temp: Cleanup code some more
  thermal/x86_pkg_temp: Cleanup namespace
  thermal/x86_pkg_temp: Get rid of ref counting
  thermal/x86_pkg_temp: Sanitize callback (de)initialization
  thermal/x86_pkg_temp: Replace open coded cpu search
  thermal/x86_pkg_temp: Remove redundant package search
  thermal/x86_pkg_temp: Cleanup thermal interrupt handling
  thermal: hwmon: Properly report critical temperature in sysfs
  devfreq_cooling: pass a pointer to devfreq in the power model callbacks
  devfreq_cooling: make the structs devfreq_cooling_xxx visible for all
  dt-bindings: rockchip-thermal: fix the misleading description
  thermal: rockchip: improve the warning log
  thermal: db8500: Fix module autoload
  thermal: tango: Fix module autoload
  ...
2016-12-13 09:00:28 -08:00
Eduardo Valentin 373f91d125 thermal: core: move slop and offset helpers to thermal_helpers.c
Reorganize code to reflect better placement.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 95e3ed1513 thermal: core: use kzalloc(sizeof(*ptr),...)
As a safety check, this patch changes thermal
core to check for pointer content size, instead of type size,
while allocating memory.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 38e7b549af thermal: core: improve kerneldoc entry of thermal_cooling_device_unregister
Improve description and keep 80 columns limit.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin b659a30d7b thermal: core: remove style warnings and checks
Removing several style issues in thermal code code.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 2a0b4c44ce thermal: core: remove void function return statements
Simply removing useless returns of void functions.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin eb7be329bd thermal: core: standardize line breaking alignment
Pass through the code to remove check suggested by
checkpatch.pl (alignment to parenthesis):
CHECK: Alignment should match open parenthesis

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 5027ba36cd thermal: core: small style fix when checking for __find_governor()
Remove style issue:
CHECK: Comparison to NULL could be written "!__find_governor"
+	if (__find_governor(governor->name) == NULL) {

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 7eb4bd723e thermal: core: remove FSF address in the GPL notice
Simplify the GPL notice by removing the FSF address.
No need to track FSF location in this file.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 81193e2e6b thermal: core: add a comment describing the device management section
comment describing the section with function to handle
registration, unregistration, binding, and unbinding of
thermal devices.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 712afbdfdf thermal: core: add a comment describing the power actor section
Simply marking the power actor section and adding a
comment describing it.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 8772e185f1 thermal: core: add a comment describing the main update loop
Simply marking the main update loop section and adding a
comment describing it.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 106339ab7e thermal: core: move notify to the zone update section
moving the helper function to closer to similar functions.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 4b0d3c2d3b thermal: core: add inline to print_bind_err_msg()
Given that this is simple wrapper, adding the inline flag.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin f502ab8440 thermal: core: move __bind() to where it is used
Moving the helper to closer where it is used.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 9b68ef89c9 thermal: core: fix couple of style issues on __bind() helper
Removing style issues on __bind() and its helpers.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 90f5b5bb7f thermal: core: move bind_tz() to where it is used
Moving the helper to closer where it is used.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 949aad839c thermal: core: move bind_cdev() to where it is used
Moving the helper to closer where it is used.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin f11997fa24 thermal: core: move __unbind() helper to where it is used
Simply moving the helper to closer where it is actually used.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 4f5163fac2 thermal: core: small style fix on __unbind() helper
Simply aligning to parenthesis.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin c30176fc6f thermal: core: move idr handling to device management section
Given that idr is only used to get id for thermal devices
(zones and cooling), makes sense to move the code closer.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 1b4f48494e thermal: core: group functions related to governor handling
Organize thermal core code to group the functions
handling with governor manipulation in one single section.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin cd221c7b63 thermal: core: introduce thermal_helpers.c
Here we have a simple code organization. This patch moves
functions that do not need to handle thermal core internal
data structure to thermal_helpers.c file.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 77dc4f9032 thermal: core: remove a couple of style issues on helpers
Reorganizing the code of helper functions to improve
readability and style, as recommended by checkpatch.pl.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 45cf2ec99c thermal: core: move cooling device sysfs to thermal_sysfs.c
This is a code reorganization, simply to concentrate
the sysfs handling functions in thermal_sysfs.c.

This patch moves the cooling device handling functions.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 99ea2eff91 thermal: core: move to_cooling_device macro to header file
Make the to_cooling_device() macro available across
files in thermal core.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin a369ee88f7 thermal: core: move thermal_zone sysfs to thermal_sysfs.c
This is a code reorganization, simply to concentrate
the code handling sysfs in a specific file: thermal_sysfs.c.

Right now, moving only the sysfs entries of thermal_zone_device.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 72afe8e549 thermal: core: match parenthesis on code alignment
Cosmetic change in the sysfs handling functions, as
recommended by checkpatch.pl.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 09544da9e8 thermal: core: treat correctly the return value of *scanf calls
This patch checks the return value of all calls to *scanf.
The check is to simply match the number of expect inputs.

The current code does not do any recovery in case the
number of treated inputs are different than the expected.
Therefore, keeping the same behavior.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin ba78da443b thermal: core: move to_thermal_zone() macro to header file
Simply making this macro available to other thermal core
files.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 97d2423bd9 thermal: core: split available_policies_show()
This patch creates a helper to build a list of available governors.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 6b885202d7 thermal: core: split policy_store
Similarly to passive_store, policy_store now is split
between thermal core data structure handling and sysfs handling.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 3d0055d2b2 thermal: core: split passive_store
Split passive_store between sysfs handling and thermal
core internal data handling.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 2a4806bf7a thermal: core: remove unnecessary device_remove() calls
Given that cdevs sysfs properties are already registered using
the dev.groups, there is no need to explicitly call device_remove()
for each property.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 4d0fe7490d thermal: core: move trips attributes to tz->device.groups
Finally, move the last thermal zone sysfs attributes to
tz->device.groups: trips attributes. This requires adding a
attribute_group to thermal_zone_device, creating it dynamically, and
then setting all trips attributes in it. The trips attribute is then
added to the tz->device.groups.

As the removal of all attributes are handled by device core, the device
remove calls are not needed anymore.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin e161aefb9a thermal: core: create tz->device.groups dynamically
This is a patch to allow adding groups created dynamically. For now we
create only the existing group. However, this is a preparation to allow
creating trip groups, which are determined only when the number of trips
are known at runtime.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 0a9de81907 thermal: core: move the trip attrs to the tz sysfs I/F section
Code reorganization to keep all the sysfs I/F of a thermal zone in the
same section.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 3bafb5e2a6 thermal: core: fix style on remove_trip_attrs()
Align to parentheses, removing checkpatch warning.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 9d934fc883 thermal: core: remove useless empty line
Fix style problem on create_trip_attrs();

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 1a7e7cc03f thermal: core: move power actor code out of sysfs I/F section
Simply reorganize code to keep only functions of sysfs interface
of thermal zone device together. Therefore, move the power actor code
out of the sysfs I/F section.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 059386f43e thermal: core: improve power actor documentation
Simple improvement on clarity and removal of checkpatch warning
in the documentation of power actor kernel doc.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 3a836bfe9f thermal: core: move passive attr to tz->device.groups
This patch moves the passive attribute to tz->device.groups. Moving the
passive attribute also requires a .is_visible() callback implementation
for its attribute group.

The logic behind the visibility of passive attribute is kept the same.
We only expose the passive attribute if the thermal driver has exposed
at least one passive trip point.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 8baa5dae60 thermal: core: move mode attribute to tz->device.groups
Moving mode attribute to tz->device.groups requires the implementation
of a .is_visible() callback. The condition returned by .is_visible() of
the mode attribute group is kept the same, we allow the attribute to be
visible only if ops->get_mode() is set by the thermal driver.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 66e554bde9 thermal: core: move emul_temp creation to tz->device.groups
emul_temp creation is dependent on a compile time
condition. Moving to tz->device.groups.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 1c600861fa thermal: core: use dev.groups to manage always present tz attributes
Thermal zones attributes are all being created using
device_create_file(). This has the disadvantage of making the code
complicated and sometimes we may miss the cleanup of them.

This patch starts to move the thermal zone sysfs attributes to the
dev.groups, so Linux device core manage them for us. For now, this patch
only moves those attributes are always present regardless of thermal
zone condition.

This change has also the advantage of cleaning up the thermal zone
parameters sysfs entries that are left unclean after device
registration.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 308f726ac8 thermal: core: group device_create_file() calls that are always created
Simple code reorganization to group files that are always created
when registering a thermal zone.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin ef1d8bff72 thermal: core: group thermal_zone DEVICE_ATTR's declarations
Simply reorganize the code to have all DEVICE_ATTR's
in one point in the file.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Eduardo Valentin 54fa38cc2e thermal: core: prevent zones with no types to be registered
There are APIs that rely on tz->type. This patch
prevent thermal zones without it to be registered.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23 10:06:12 +08:00
Johannes Berg 56989f6d85 genetlink: mark families as __ro_after_init
Now genl_register_family() is the only thing (other than the
users themselves, perhaps, but I didn't find any doing that)
writing to the family struct.

In all families that I found, genl_register_family() is only
called from __init functions (some indirectly, in which case
I've add __init annotations to clarifly things), so all can
actually be marked __ro_after_init.

This protects the data structure from accidental corruption.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27 16:16:09 -04:00
Johannes Berg 489111e5c2 genetlink: statically initialize families
Instead of providing macros/inline functions to initialize
the families, make all users initialize them statically and
get rid of the macros.

This reduces the kernel code size by about 1.6k on x86-64
(with allyesconfig).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27 16:16:09 -04:00
Johannes Berg a07ea4d994 genetlink: no longer support using static family IDs
Static family IDs have never really been used, the only
use case was the workaround I introduced for those users
that assumed their family ID was also their multicast
group ID.

Additionally, because static family IDs would never be
reserved by the generic netlink code, using a relatively
low ID would only work for built-in families that can be
registered immediately after generic netlink is started,
which is basically only the control family (apart from
the workaround code, which I also had to add code for so
it would reserve those IDs)

Thus, anything other than GENL_ID_GENERATE is flawed and
luckily not used except in the cases I mentioned. Move
those workarounds into a few lines of code, and then get
rid of GENL_ID_GENERATE entirely, making it more robust.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27 16:16:09 -04:00
Srinivas Pandruvada 0e70f466fb thermal: Enhance thermal_zone_device_update for events
Added one additional parameter to thermal_zone_device_update() to provide
caller with an optional capability to specify reason.
Currently this event is used by user space governor to trigger different
processing based on event code. Also it saves an additional call to read
temperature when the event is received.
The following events are cuurently defined:
- Unspecified event
- New temperature sample
- Trip point violated
- Trip point changed
- thermal device up and down
- thermal device power capability changed

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-09-27 14:35:21 +08:00
Sascha Hauer 060c034a97 thermal: Add support for hardware-tracked trip points
This adds support for hardware-tracked trip points to the device tree
thermal sensor framework.

The framework supports an arbitrary number of trip points. Whenever
the current temperature is updated, the trip points immediately
below and above the current temperature are found. A .set_trips
callback is then called with the temperatures. If there is no trip
point above or below the current temperature, the passed trip
temperature will be -INT_MAX or INT_MAX respectively. In this callback,
the driver should program the hardware such that it is notified
when either of these trip points are triggered. When a trip point
is triggered, the driver should call `thermal_zone_device_update'
for the respective thermal zone. This will cause the trip points
to be updated again.

If .set_trips is not implemented, the framework behaves as before.

This patch is based on an earlier version from Mikko Perttunen
<mikko.perttunen@kapsi.fi>

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: linux-pm@vger.kernel.org
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-09-27 14:02:16 +08:00
Rajendra Nayak 4a7069a32c thermal: core: export apis to get slope and offset
Add apis for platform thermal drivers to query for slope and offset
attributes, which might be needed for temperature calculations.

Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-09-27 14:02:16 +08:00
Michele Di Giorgio d0b7306d20 thermal: fix race condition when updating cooling device
When multiple thermal zones are bound to the same cooling device, multiple
kernel threads may want to update the cooling device state by calling
thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
condition. Consider the following situation with two kernel threads k1 and k2:

	    Thread k1				Thread k2
                                    ||
                                    ||  call thermal_cdev_update()
                                    ||      ...
                                    ||      set_cur_state(cdev, target);
    call power_actor_set_power()    ||
        ...                         ||
        instance->target = state;   ||
        cdev->updated = false;      ||
                                    ||      cdev->updated = true;
                                    ||      // completes execution
    call thermal_cdev_update()      ||
        // cdev->updated == true    ||
        return;                     ||
                                    \/
                                    time

k2 has already looped through the thermal instances looking for the deepest
cooling device state and is preempted right before setting cdev->updated to
true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
to false. Then, k1 is preempted and k2 continues the execution by setting
cdev->updated to true, therefore preventing k1 from performing the update.
Notice that this is not an issue if k2 looks at the instance->target modified by
k1 "after" it is assigned by k1. In fact, in this case the update will happen
anyway and k1 can safely return immediately from thermal_cdev_update().

This may lead to a situation where a thermal governor never updates the cooling
device. For example, this is the case for the step_wise governor: when calling
the function thermal_zone_trip_update(), the governor may always get a new state
equal to the old one (which, however, wasn't notified to the cooling device) and
will therefore skip the update.

CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Peter Feuerer <peter@piie.net>
Reported-by: Toby Huang <toby.huang@arm.com>
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-08 10:57:39 +08:00
Leo Yan 15333e3af1 thermal: use %d to print S32 parameters
Power allocator's parameters are S32 type, so use %d to print them.

Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-04-27 15:54:51 -07:00
Wei Ni 1d0fd42fa3 thermal: consistently use int for trip temp
The commit 17e8351a77 consistently use int for temperature,
however it missed a few in trip temperature and thermal_core.

In current codes, the trip->temperature used "unsigned long"
and zone->temperature used"int", if the temperature is negative
value, it will get wrong result when compare temperature with
trip temperature.

This patch can fix it.

Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-04-20 20:31:14 -07:00
Zhang Rui 81ad4276b5 Thermal: Ignore invalid trip points
In some cases, platform thermal driver may report invalid trip points,
thermal core should not take any action for these trip points.

This fixed a regression that bogus trip point starts to screw up thermal
control on some Lenovo laptops, after
commit bb431ba26c
Author: Zhang Rui <rui.zhang@intel.com>
Date:   Fri Oct 30 16:31:47 2015 +0800

    Thermal: initialize thermal zone device correctly

    After thermal zone device registered, as we have not read any
    temperature before, thus tz->temperature should not be 0,
    which actually means 0C, and thermal trend is not available.
    In this case, we need specially handling for the first
    thermal_zone_device_update().

    Both thermal core framework and step_wise governor is
    enhanced to handle this. And since the step_wise governor
    is the only one that uses trends, so it's the only thermal
    governor that needs to be updated.

    Tested-by: Manuel Krause <manuelkrause@netscape.net>
    Tested-by: szegad <szegadlo@poczta.onet.pl>
    Tested-by: prash <prash.n.rao@gmail.com>
    Tested-by: amish <ammdispose-arch@yahoo.com>
    Tested-by: Matthias <morpheusxyz123@yahoo.de>
    Reviewed-by: Javi Merino <javi.merino@arm.com>
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Chen Yu <yu.c.chen@intel.com>

CC: <stable@vger.kernel.org> #3.18+
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1317190
Link: https://bugzilla.kernel.org/show_bug.cgi?id=114551
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-03-18 14:10:57 +08:00
Zhang Rui 98d94507e1 Merge branches 'thermal-intel', 'thermal-suspend-fix' and 'thermal-soc' into next 2016-01-23 11:43:27 +08:00
Kuninori Morimoto ad74e46cb3 thermal: trip_point_temp_store() calls thermal_zone_device_update()
trip_point_temp_store() updates trip temperature. It should call
thermal_zone_device_update() immediately.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-01-06 18:06:39 -08:00
Chen Yu 4511f7166a Thermal: do thermal zone update after a cooling device registered
When a new cooling device is registered, we need to update the
thermal zone to set the new registered cooling device to a proper
state.

This fixes a problem that the system is cool, while the fan devices
are left running on full speed after boot, if fan device is registered
after thermal zone device.

Here is the history of why current patch looks like this:
https://patchwork.kernel.org/patch/7273041/

CC: <stable@vger.kernel.org> #3.18+
Reference:https://bugzilla.kernel.org/show_bug.cgi?id=92431
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-12-29 16:00:00 +08:00
Zhang Rui ff140fea84 Thermal: handle thermal zone device properly during system sleep
Current thermal code does not handle system sleep well because
1. the cooling device cooling state may be changed during suspend
2. the previous temperature reading becomes invalid after resumed because
   it is got before system sleep
3. updating thermal zone device during suspending/resuming
   is wrong because some devices may have already been suspended
   or may have not been resumed.

Thus, the proper way to do this is to cancel all thermal zone
device update requirements during suspend/resume, and after all
the devices have been resumed, reset and update every registered
thermal zone devices.

This also fixes a regression introduced by:
Commit 19593a1fb1 ("ACPI / fan: convert to platform driver")
Because, with above commit applied, all the fan devices are attached
to the acpi_general_pm_domain, and they are turned on by the pm_domain
automatically after resume, without the awareness of thermal core.

CC: <stable@vger.kernel.org> #3.18+
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Tested-by: Matthias <morpheusxyz123@yahoo.de>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-12-29 15:59:53 +08:00
Zhang Rui bb431ba26c Thermal: initialize thermal zone device correctly
After thermal zone device registered, as we have not read any
temperature before, thus tz->temperature should not be 0,
which actually means 0C, and thermal trend is not available.
In this case, we need specially handling for the first
thermal_zone_device_update().

Both thermal core framework and step_wise governor is
enhanced to handle this. And since the step_wise governor
is the only one that uses trends, so it's the only thermal
governor that needs to be updated.

CC: <stable@vger.kernel.org> #3.18+
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Tested-by: Matthias <morpheusxyz123@yahoo.de>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-12-29 15:59:44 +08:00
Javi Merino c973c3bcec thermal: Add a function to get the minimum power
The thermal core already has a function to get the maximum power of a
cooling device: power_actor_get_max_power().  Add a function to get the
minimum power of a cooling device.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-09-14 07:39:46 -07:00
Zhang Rui 5a924a07f8 Merge branches 'thermal-core' and 'thermal-intel' of .git into next 2015-09-02 10:08:02 +08:00
Sascha Hauer 934c93b8c1 thermal: Add comment explaining test for critical temperature
The code testing if a temperature should be emulated or not is
not obvious. Add a comment explaining why this test is done.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:51 +08:00
Sascha Hauer 79e5421cf0 thermal: Use IS_ENABLED instead of #ifdef
Use IS_ENABLED(CONFIG_THERMAL_EMULATION) to make the code more readable
and to get rid of the addtional #ifdef around the variable definitions
in thermal_zone_get_temp().

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:51 +08:00
Sascha Hauer dbdf2532b4 thermal: remove unnecessary call to thermal_zone_device_set_polling
When the thermal zone has no get_temp callback then thermal_zone_device_register()
calls thermal_zone_device_set_polling() with a polling delay of 0. This
only cancels the poll_queue. Since the poll_queue hasn't been scheduled this
is a no-op. Remove it.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:51 +08:00
Sascha Hauer f6be058493 thermal: trivial: fix typo in comment
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:51 +08:00
Sascha Hauer 17e8351a77 thermal: consistently use int for temperatures
The thermal code uses int, long and unsigned long for temperatures
in different places.

Using an unsigned type limits the thermal framework to positive
temperatures without need. Also several drivers currently will report
temperatures near UINT_MAX for temperatures below 0°C. This will probably
immediately shut the machine down due to overtemperature if started below
0°C.

'long' is 64bit on several architectures. This is not needed since INT_MAX °mC
is above the melting point of all known materials.

Consistently use a plain 'int' for temperatures throughout the thermal code and
the drivers. This only changes the places in the drivers where the temperature
is passed around as pointer, when drivers internally use another type this is
not changed.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Lukasz Majewski <l.majewski@samsung.com>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Peter Feuerer <peter@piie.net>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Peter Feuerer <peter@piie.net>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Lukasz Majewski <l.majewski@samsung.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: linux-acpi@vger.kernel.org
Cc: platform-driver-x86@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-omap@vger.kernel.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: lm-sensors@lm-sensors.org
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:50 +08:00
Ni Wade 25a0a5ce16 thermal: add available policies sysfs attribute
The Linux thermal framework support to change thermal governor
policy in userspace, but it can't show what available policies
supported.

This patch adds available_policies attribute to the thermal
framework, it can list the thermal governors which can be
used for a particular zone. This attribute is read only.

Signed-off-by: Wei Ni <wni@nvidia.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:50 +08:00
Viresh Kumar 528464eaa4 thermal: remove dangling 'weight_attr' device file
This file isn't getting removed while we unbind a device from thermal
zone. And this causes following messages when the device is registered
again:

WARNING: CPU: 0 PID: 2228 at /home/viresh/linux/fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x70()
sysfs: cannot create duplicate filename '/devices/virtual/thermal/thermal_zone0/cdev0_weight'
Modules linked in: cpufreq_dt(+) [last unloaded: cpufreq_dt]
CPU: 0 PID: 2228 Comm: insmod Not tainted 4.2.0-rc3-00059-g44fffd9473eb #272
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c00153e8>] (unwind_backtrace) from [<c0012368>] (show_stack+0x10/0x14)
[<c0012368>] (show_stack) from [<c053a684>] (dump_stack+0x84/0xc4)
[<c053a684>] (dump_stack) from [<c002284c>] (warn_slowpath_common+0x80/0xb0)
[<c002284c>] (warn_slowpath_common) from [<c00228ac>] (warn_slowpath_fmt+0x30/0x40)
[<c00228ac>] (warn_slowpath_fmt) from [<c012d524>] (sysfs_warn_dup+0x60/0x70)
[<c012d524>] (sysfs_warn_dup) from [<c012d244>] (sysfs_add_file_mode_ns+0x13c/0x190)
[<c012d244>] (sysfs_add_file_mode_ns) from [<c012d2d4>] (sysfs_create_file_ns+0x3c/0x48)
[<c012d2d4>] (sysfs_create_file_ns) from [<c03c04a8>] (thermal_zone_bind_cooling_device+0x260/0x358)
[<c03c04a8>] (thermal_zone_bind_cooling_device) from [<c03c2e70>] (of_thermal_bind+0x88/0xb4)
[<c03c2e70>] (of_thermal_bind) from [<c03c10d0>] (__thermal_cooling_device_register+0x17c/0x2e0)
[<c03c10d0>] (__thermal_cooling_device_register) from [<c03c3f50>] (__cpufreq_cooling_register+0x3a0/0x51c)
[<c03c3f50>] (__cpufreq_cooling_register) from [<bf00505c>] (cpufreq_ready+0x44/0x88 [cpufreq_dt])
[<bf00505c>] (cpufreq_ready [cpufreq_dt]) from [<c03d6c30>] (cpufreq_add_dev+0x4a0/0x7dc)
[<c03d6c30>] (cpufreq_add_dev) from [<c02cd3ec>] (subsys_interface_register+0x94/0xd8)
[<c02cd3ec>] (subsys_interface_register) from [<c03d785c>] (cpufreq_register_driver+0x10c/0x1f0)
[<c03d785c>] (cpufreq_register_driver) from [<bf0057d4>] (dt_cpufreq_probe+0x60/0x8c [cpufreq_dt])
[<bf0057d4>] (dt_cpufreq_probe [cpufreq_dt]) from [<c02d03e4>] (platform_drv_probe+0x44/0xa4)
[<c02d03e4>] (platform_drv_probe) from [<c02cead8>] (driver_probe_device+0x174/0x2b4)
[<c02cead8>] (driver_probe_device) from [<c02ceca4>] (__driver_attach+0x8c/0x90)
[<c02ceca4>] (__driver_attach) from [<c02cd078>] (bus_for_each_dev+0x68/0x9c)
[<c02cd078>] (bus_for_each_dev) from [<c02ce2f0>] (bus_add_driver+0x19c/0x214)
[<c02ce2f0>] (bus_add_driver) from [<c02cf490>] (driver_register+0x78/0xf8)
[<c02cf490>] (driver_register) from [<c0009710>] (do_one_initcall+0x8c/0x1d4)
[<c0009710>] (do_one_initcall) from [<c05396b0>] (do_init_module+0x5c/0x1b8)
[<c05396b0>] (do_init_module) from [<c0086490>] (load_module+0xd34/0xed8)
[<c0086490>] (load_module) from [<c0086704>] (SyS_init_module+0xd0/0x120)
[<c0086704>] (SyS_init_module) from [<c000f480>] (ret_fast_syscall+0x0/0x3c)
---[ end trace 3be0e7b7dc6e3c4f ]---

Fixes: db91651311 ("thermal: export weight to sysfs")
Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-08-02 19:36:57 -07:00
Eduardo Valentin 9d0be7f481 thermal: support slope and offset coefficients
It is common to have a linear extrapolation from
the current sensor readings and the actual temperature
value. This is specially the case when the sensor
is in use to extrapolate hotspots.

This patch adds slope and offset constants for
single sensor linear extrapolation equation. Because
the same sensor can be use in different locations,
from board to board, these constants are added
as part of thermal_zone_params.

The constants are available through sysfs.

It is up to the device driver to determine
the usage of these values.

Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-11 19:46:52 -07:00
Javi Merino 9f38271c6f thermal: export thermal_zone_parameters to sysfs
It's useful for tuning to be able to edit thermal_zone_parameters from
userspace.  Export them to the thermal_zone sysfs so that they can be
easily changed.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:54 -07:00
Punit Agrawal 35e946447f thermal: core: Add Kconfig option to enable writable trips
Add a Kconfig option to allow system integrators to control whether
userspace tools can change trip temperatures. This option overrides
the thermal zone setup in the driver code and must be enabled for
platform specified writable trips to come into effect.

The original behaviour of requiring root privileges to change trip
temperatures remains unchanged.

Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:53 -07:00
Javi Merino 6b775e870c thermal: introduce the Power Allocator governor
The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on "power actors", entities that represent heat
sources.  They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone.  The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control.  It decides how much power to give each cooling device based
on the performance they are requesting.  The PID controller ensures
that the total power budget does not exceed the control temperature.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:52 -07:00
Javi Merino 35b11d2e3a thermal: extend the cooling device API to include power information
Add three optional callbacks to the cooling device interface to allow
them to express power.  In addition to the callbacks, add helpers to
identify cooling devices that implement the power cooling device API.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:52 -07:00
Javi Merino e33df1d2f3 thermal: let governors have private data for each thermal zone
A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:52 -07:00
Javi Merino db91651311 thermal: export weight to sysfs
It's useful to have access to the weights for the cooling devices for
thermal zones and change them if needed.  Export them to sysfs.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:51 -07:00
Kapileshwar Singh 6cd9e9f629 thermal: of: fix cooling device weights in device tree
Currently you can specify the weight of the cooling device in the device
tree but that information is not populated to the
thermal_bind_params where the fair share governor expects it to
be.  The of thermal zone device doesn't have a thermal_bind_params
structure and arguably it's better to pass the weight inside the
thermal_instance as it is specific to the bind of a cooling device to a
thermal zone parameter.

Core thermal code is fixed to populate the weight in the instance from
the thermal_bind_params, so platform code that was passing the weight
inside the thermal_bind_params continue to work seamlessly.

While we are at it, create a default value for the weight parameter for
those thermal zones that currently don't define it and remove the
hardcoded default in of-thermal.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Peter Feuerer <peter@piie.net>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Kapileshwar Singh <kapileshwar.singh@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:50 -07:00
Hans de Goede 7e497a7375 thermal: Do not log an error if thermal_zone_get_temp returns -EAGAIN
Some temperature sensors only get updated every few seconds and while
waiting for the first irq reporting a (new) temperature to happen there
get_temp operand will return -EAGAIN as it does not have any data to report
yet.

Not logging an error in this case avoids messages like these from showing
up in dmesg on affected systems:

[    1.219353] thermal thermal_zone0: failed to read out thermal zone 0
[    2.015433] thermal thermal_zone0: failed to read out thermal zone 0
[    2.416737] thermal thermal_zone0: failed to read out thermal zone 0

Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-04-07 13:11:29 -07:00
Matthias Kaehlcke 2dc10f8963 thermal: Make sysfs attributes of cooling devices default attributes
Default attributes are created when the device is registered. Attributes
created after device registration can lead to race conditions, where user space
(e.g. udev) sees the device but not the attributes.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-03-05 01:47:57 -04:00
Johannes Berg 053c095a82 netlink: make nlmsg_end() and genlmsg_end() void
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.

This makes the very common pattern of

  if (genlmsg_end(...) < 0) { ... }

be a whole bunch of dead code. Many places also simply do

  return nlmsg_end(...);

and the caller is expected to deal with it.

This also commonly (at least for me) causes errors, because it is very
common to write

  if (my_function(...))
    /* error condition */

and if my_function() does "return nlmsg_end()" this is of course wrong.

Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.

Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did

-	return nlmsg_end(...);
+	nlmsg_end(...);
+	return 0;

I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with <= 0 in dump functionality, but that could just
be changed to < 0 with no change in behaviour, so I opted for the more
efficient version.

One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for <0 or <=0 and thus broke out of the loop every single time.
I've preserved this since it will (I think) have caused the messages to
userspace to be formatted differently with just a single message for
every SKB returned to userspace. It's possible that this isn't needed
for the tools that actually use this, but I don't even know what they
are so couldn't test that changing this behaviour would be acceptable.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 01:03:45 -05:00
Zhang Rui 32c9edc4e3 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal into thermal-soc 2014-12-21 22:49:12 +08:00
Lukasz Majewski 9a3031dc3e thermal:core:fix: Check return code of the ->get_max_state() callback
The return code from ->get_max_state() callback was not checked during
binding cooling device to thermal zone device.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2014-12-08 21:32:56 -04:00
Luis Henriques 9d367e5e7b thermal: Fix error path in thermal_init()
thermal_unregister_governors() and class_unregister() were being called in
the wrong order.

Fixes: 80a26a5c22 ("Thermal: build thermal governors into thermal_sys module")
Cc: stable@vger.kernel.org
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-12-08 12:17:25 +08:00
Javi Merino b6cc772f64 thermal: lock the thermal zone when switching governors
Currently, userspace can request a governor change while the governor
itself is running.  Grab the thermal zone lock when changing the
governor to prevent this race.

Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-12-08 12:10:44 +08:00
Srinivas Pandruvada 84ffe3ecc2 thermal: core: ignore invalid trip temperature
Ignore invalid trip temperature less or equal to zero. Some
buggy systems have invalid trips, causing system shutdown.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-12-08 11:51:46 +08:00
Yao Dongdong 1401586056 Thermal:Remove usless if(!result) before return tz
result is always zero when comes here.

Signed-off-by: Yao Dongdong <yaodongdong@huawei.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2014-11-03 18:59:50 -04:00
Linus Torvalds 8264fce6de Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
 "Sorry that I missed the merge window as there is a bug found in the
  last minute, and I have to fix it and wait for the code to be tested
  in linux-next tree for a few days.  Now the buggy patch has been
  dropped entirely from my next branch.  Thus I hope those changes can
  still be merged in 3.18-rc2 as most of them are platform thermal
  driver changes.

  Specifics:

   - introduce ACPI INT340X thermal drivers.

     Newer laptops and tablets may have thermal sensors and other
     devices with thermal control capabilities that are exposed for the
     OS to use via the ACPI INT340x device objects.  Several drivers are
     introduced to expose the temperature information and cooling
     ability from these objects to user-space via the normal thermal
     framework.

     From: Lu Aaron, Lan Tianyu, Jacob Pan and Zhang Rui.

   - introduce a new thermal governor, which just uses a hysteresis to
     switch abruptly on/off a cooling device.  This governor can be used
     to control certain fan devices that can not be throttled but just
     switched on or off.  From: Peter Feuerer.

   - introduce support for some new thermal interrupt functions on
     i.MX6SX, in IMX thermal driver.  From: Anson, Huang.

   - introduce tracing support on thermal framework.  From: Punit
     Agrawal.

   - small fixes in OF thermal and thermal step_wise governor"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (25 commits)
  Thermal: int340x thermal: select ACPI fan driver
  Thermal: int3400_thermal: use acpi_thermal_rel parsing APIs
  Thermal: int340x_thermal: expose acpi thermal relationship tables
  Thermal: introduce int3403 thermal driver
  Thermal: introduce INT3402 thermal driver
  Thermal: move the KELVIN_TO_MILLICELSIUS macro to thermal.h
  ACPI / Fan: support INT3404 thermal device
  ACPI / Fan: add ACPI 4.0 style fan support
  ACPI / fan: convert to platform driver
  ACPI / fan: use acpi_device_xxx_power instead of acpi_bus equivelant
  ACPI / fan: remove no need check for device pointer
  ACPI / fan: remove unused macro
  Thermal: int3400 thermal: register to thermal framework
  Thermal: int3400 thermal: add capability to detect supporting UUIDs
  Thermal: introduce int3400 thermal driver
  ACPI: add ACPI_TYPE_LOCAL_REFERENCE support to acpi_extract_package()
  ACPI: make acpi_create_platform_device() an external API
  thermal: step_wise: fix: Prevent from binary overflow when trend is dropping
  ACPI: introduce ACPI int340x thermal scan handler
  thermal: Added Bang-bang thermal governor
  ...
2014-10-24 11:21:43 -07:00
Rasmus Villemoes 484ac2f32d thermal: replace strnicmp with strncasecmp
The kernel used to contain two functions for length-delimited,
case-insensitive string comparison, strnicmp with correct semantics and
a slightly buggy strncasecmp.  The latter is the POSIX name, so strnicmp
was renamed to strncasecmp, and strnicmp made into a wrapper for the new
strncasecmp to avoid breaking existing users.

To allow the compat wrapper strnicmp to be removed at some point in the
future, and to avoid the extra indirection cost, do
s/strnicmp/strncasecmp/g.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:25 +02:00
Zhang Rui dd63466679 Merge branches 'eduardo-soc' and 'bang-bang-governor' of .git into next 2014-09-18 14:48:40 +08:00
Peter Feuerer e4dbf98f7f thermal: Added Bang-bang thermal governor
The bang-bang thermal governor uses a hysteresis to switch abruptly on
or off a cooling device.  It is intended to control fans, which can
not be throttled but just switched on or off.
Bang-bang cannot be set as default governor as it is intended for
special devices only.  For those special devices the driver needs to
explicitely request it.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Andreas Mohr <andi@lisas.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Javi Merino <javi.merino@arm.com>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Peter Feuerer <peter@piie.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-08-27 15:45:58 +08:00