feat(fancontrol): Implement fan control system#1954
feat(fancontrol): Implement fan control system#1954ruaan-deysel wants to merge 2 commits intounraid:mainfrom
Conversation
…gement - Added FanSafetyService to manage fan states and emergency protocols. - Introduced FanControlConfigService for handling fan control configurations. - Created models for fan control configuration, including safety parameters. - Developed resolver and input types for fan control operations. - Implemented fan control metrics and summary calculations. - Added unit tests for fan control service and related functionalities.
WalkthroughThis PR introduces a comprehensive fan control system with hardware monitoring via hwmon and IPMI providers, fan curve interpolation service, safety validation service, GraphQL models and resolvers, configuration management, and integration into the metrics resolver with pubsub support. Version updated to 4.31.1. Changes
Sequence DiagramsequenceDiagram
participant Client
participant FanCurveService
participant TemperatureService
participant HwmonService/IpmiFanService
participant FanSafetyService
Client->>FanCurveService: start(zones)
FanCurveService->>FanCurveService: setInterval(applyCurves, 2000ms)
loop Every polling interval
FanCurveService->>TemperatureService: getMetrics()
TemperatureService-->>FanCurveService: current temps
FanCurveService->>HwmonService/IpmiFanService: readAll()
HwmonService/IpmiFanService-->>FanCurveService: RawFanReadings
FanCurveService->>FanCurveService: For each zone: interpolateSpeed(curve, temp)
FanCurveService->>FanCurveService: Convert speed 0-100 to PWM 0-255
FanCurveService->>FanSafetyService: captureState(fanId, devicePath, pwmNumber)
FanSafetyService-->>FanCurveService: state captured
FanCurveService->>FanSafetyService: validatePwmValue(pwmValue)
FanSafetyService-->>FanCurveService: validated PWM
FanCurveService->>HwmonService/IpmiFanService: setPwm(devicePath, pwmNumber, value)
HwmonService/IpmiFanService-->>FanCurveService: PWM set
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 538bc27632
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| }; | ||
| } | ||
|
|
||
| this.configService.replaceConfig(updated); |
There was a problem hiding this comment.
Wire fan curve engine to config updates
This mutation only persists the config and returns success, but nothing in this commit ever invokes FanCurveService.start()/stop(). As a result, enabling fan control or changing fan-control settings has no runtime effect on automatic curves, so users can save settings that never get applied to hardware.
Useful? React with 👍 / 👎.
| const readings = await this.hwmonService.readAll(); | ||
| for (const reading of readings) { | ||
| if (reading.hasPwmControl) { | ||
| try { |
There was a problem hiding this comment.
Capture fan state before emergency full-speed override
In emergency mode, the code immediately forces all PWM fans to manual/full speed without recording their previous state. restoreAllFans() only restores entries in originalStates, so fans that were not previously captured via normal control paths cannot be restored after an emergency and can remain stuck in forced manual mode.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Adds a new fan control/monitoring subsystem to the Unraid API Metrics GraphQL surface, including polling + subscription publishing, sysfs hwmon reading, and initial mutation endpoints for PWM/mode changes.
Changes:
- Introduces
FanControlModulewith hwmon + IPMI providers, config persistence, safety helpers, and curve interpolation scaffolding. - Extends
Metrics/MetricsResolverto exposefanControlmetrics and asystemMetricsFanControlsubscription (via newFAN_METRICSpubsub channel). - Updates unit/integration tests to wire the new services into existing metrics/temperature test harnesses.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/unraid-shared/src/pubsub/graphql.pubsub.ts | Adds FAN_METRICS pubsub channel enum value for GraphQL subscriptions. |
| api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.resolver.integration.spec.ts | Updates test DI graph to include fan control service/config providers. |
| api/src/unraid-api/graph/resolvers/metrics/metrics.resolver.ts | Registers fan polling + publishes systemMetricsFanControl; adds fanControl field + subscription. |
| api/src/unraid-api/graph/resolvers/metrics/metrics.resolver.spec.ts | Updates unit tests for new resolver dependencies and topic count. |
| api/src/unraid-api/graph/resolvers/metrics/metrics.resolver.integration.spec.ts | Updates integration test wiring for new resolver dependencies. |
| api/src/unraid-api/graph/resolvers/metrics/metrics.module.ts | Imports FanControlModule into MetricsModule. |
| api/src/unraid-api/graph/resolvers/metrics/metrics.model.ts | Adds fanControl field to the Metrics GraphQL type. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.module.ts | Registers fan control services (config, safety, curve, controllers, resolver). |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.service.ts | Aggregates fan readings from available providers and builds summary metrics (with caching). |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.service.spec.ts | Unit tests for provider selection, PWM percentage math, detection flags, caching, and mapping helpers. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.resolver.ts | Adds mutations for setFanSpeed, setFanMode, restoreAllFans, and config update. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.model.ts | Defines GraphQL models/enums for fan metrics, fans, profiles, and curve inputs. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.input.ts | Defines mutation input types for speed/mode/profile assignment. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol-config.service.ts | Adds persistent config service for fancontrol.json with defaults. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol-config.model.ts | Defines config object/input types (safety/profile/zone types). |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-safety.service.ts | Adds safety clamping, state capture/restore, and emergency full-speed helper. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-safety.service.spec.ts | Unit tests for safety clamping, emergency mode behavior, capture/restore. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.ts | Adds curve interpolation and interval-based curve application scaffolding. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.spec.ts | Unit tests for curve interpolation behavior. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/controller.interface.ts | Defines provider interface + helper mappings for pwm enable/mode. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/hwmon.service.ts | Implements sysfs hwmon device discovery, reading, and sysfs PWM writes. |
| api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/ipmi_fan.service.ts | Implements IPMI sensor reading (and stubs for control via raw commands). |
| api/dev/configs/api.json | Updates dev config version and plugin list. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| throw new Error(`Fan ${input.fanId} does not support PWM control`); | ||
| } | ||
|
|
||
| const currentMode = pwmEnableToControlMode(fan.pwmEnable); |
There was a problem hiding this comment.
SetFanModeInput uses the FanControlMode enum (which includes OFF), but FanSafetyService.validateModeTransition() always rejects OFF, making this an API option that can never succeed. Consider removing OFF from the mutation input (separate enum) or allowing a safe OFF transition (and documenting/implementing the expected hwmon behavior).
| const currentMode = pwmEnableToControlMode(fan.pwmEnable); | |
| const currentMode = pwmEnableToControlMode(fan.pwmEnable); | |
| if (input.mode === FanControlMode.OFF) { | |
| throw new Error( | |
| 'Turning fans fully off is not supported by the safety system. ' + | |
| 'Please use MANUAL, AUTOMATIC, or FIXED modes instead.' | |
| ); | |
| } |
| @@ -0,0 +1,46 @@ | |||
| import { Field, Float, InputType, Int } from '@nestjs/graphql'; | |||
There was a problem hiding this comment.
Unused import: Float is imported from @nestjs/graphql but not referenced in this file. This will typically fail linting (no-unused-vars) and should be removed.
| import { Field, Float, InputType, Int } from '@nestjs/graphql'; | |
| import { Field, InputType, Int } from '@nestjs/graphql'; |
| @Field(() => Int, { description: 'PWM value (0-255)' }) | ||
| @IsNumber() | ||
| pwmValue!: number; |
There was a problem hiding this comment.
pwmValue is documented as 0-255, but the input type doesn’t enforce that range (or integer-ness). Since this value directly drives hardware writes (after safety clamping), it would be better to validate at the API boundary (e.g., @Min(0), @Max(255), @IsInt) so callers get a clear validation error instead of silent clamping.
| .map((f) => f.name); | ||
|
|
||
| return { | ||
| totalFans: fans.length, |
There was a problem hiding this comment.
FanControlSummary.totalFans is described in the GraphQL schema as “Total number of fans detected”, but here it’s set to fans.length (which includes detected=false / disconnected fans). Either change this to detectedFans.length or update the schema/field description to reflect the actual meaning.
| totalFans: fans.length, | |
| totalFans: detectedFans.length, |
| validatePwmValue(value: number): number { | ||
| const config = this.configService.getConfig(); | ||
| const minPercent = config.safety?.min_speed_percent ?? 20; | ||
| const minPwm = Math.round((minPercent / 100) * 255); | ||
| return Math.max(minPwm, Math.min(255, Math.round(value))); | ||
| } | ||
|
|
||
| validateCpuFanPwm(value: number): number { | ||
| const config = this.configService.getConfig(); | ||
| const minPercent = config.safety?.cpu_min_speed_percent ?? 30; | ||
| const minPwm = Math.round((minPercent / 100) * 255); | ||
| return Math.max(minPwm, Math.min(255, Math.round(value))); | ||
| } |
There was a problem hiding this comment.
The config includes safety settings like max_temp_before_full and fan_failure_threshold, but this service never reads/uses them, so they currently have no effect. Given the PR description calls out critical-temp overrides and fan-failure detection, consider either implementing those checks (e.g., trigger emergencyFullSpeed() / mark failing fans when RPM drops to 0 with PWM > 0) or removing the unused config options to avoid a false sense of safety.
| @Injectable() | ||
| export class FanSafetyService { | ||
| private readonly logger = new Logger(FanSafetyService.name); | ||
| private originalStates = new Map<string, OriginalState>(); | ||
| private isEmergencyMode = false; | ||
|
|
||
| constructor( | ||
| private readonly hwmonService: HwmonService, | ||
| private readonly configService: FanControlConfigService | ||
| ) {} |
There was a problem hiding this comment.
FanSafetyService captures original PWM/mode state, but nothing calls restoreAllFans() automatically on shutdown/crash (no OnModuleDestroy hook or similar). This means a manual PWM change could persist after the API stops, which conflicts with the “fallback to hardware defaults on service failure” behavior described in the PR. Consider restoring captured states in an onModuleDestroy/shutdown handler (best-effort) to reduce operational risk.
| if (this.isApplyingCurves) { | ||
| return; | ||
| } | ||
| await this.applyCurves(); |
There was a problem hiding this comment.
The setInterval callback is async and awaits applyCurves() without handling rejections. If applyCurves() throws, this will create an unhandled promise rejection (and can crash the process depending on Node/unhandledRejection settings). Wrap the call in a try/catch (or .catch(...)) and log the error so the interval keeps running safely.
| await this.applyCurves(); | |
| try { | |
| await this.applyCurves(); | |
| } catch (error) { | |
| this.logger.error( | |
| 'Error applying fan curves in interval loop', | |
| error | |
| ); | |
| } |
| const metrics = Object.assign(new FanControlMetrics(), { | ||
| id: 'fanControl', | ||
| fans, | ||
| profiles: [], | ||
| summary, | ||
| }); |
There was a problem hiding this comment.
profiles is always returned as an empty list, and there’s no wiring that applies/activates profiles or starts FanCurveService. This doesn’t match the PR description (profiles + automatic curves + related mutations), and it makes the profiles/activeProfile fields misleading for API consumers. Either implement the profile/curve plumbing or trim the schema/description to what’s actually supported in this PR.
|
|
||
| @ObjectType({ implements: () => Node }) | ||
| export class FanControlMetrics extends Node { | ||
| @Field(() => [Fan], { description: 'All detected fans' }) |
There was a problem hiding this comment.
FanControlMetrics.fans is described as “All detected fans”, but FanControlService includes entries with detected=false (RPM 0) in the list. Either filter out undetected fans before returning, or adjust this field description so clients don’t assume every returned fan is physically present.
| @Field(() => [Fan], { description: 'All detected fans' }) | |
| @Field(() => [Fan], { | |
| description: | |
| 'All fans reported by the fan control service, including entries that may be undetected or have zero RPM', | |
| }) |
| @ObjectType() | ||
| export class FanProfileConfig { | ||
| @Field(() => String, { nullable: true }) | ||
| @IsString() | ||
| @IsOptional() | ||
| description?: string; | ||
|
|
||
| @Field(() => [FanCurvePointConfig]) | ||
| @ValidateNested({ each: true }) | ||
| @Type(() => FanCurvePointConfig) | ||
| curve!: FanCurvePointConfig[]; | ||
| } | ||
|
|
||
| @ObjectType() | ||
| export class FanZoneConfig { | ||
| @Field(() => [String]) | ||
| @IsString({ each: true }) | ||
| fans!: string[]; | ||
|
|
||
| @Field(() => String) | ||
| @IsString() | ||
| sensor!: string; | ||
|
|
||
| @Field(() => String) | ||
| @IsString() | ||
| profile!: string; | ||
| } | ||
|
|
||
| @ObjectType() | ||
| export class FanControlConfig { | ||
| @Field({ nullable: true }) | ||
| @IsBoolean() | ||
| @IsOptional() | ||
| enabled?: boolean; | ||
|
|
||
| @Field({ nullable: true }) | ||
| @IsBoolean() | ||
| @IsOptional() | ||
| control_enabled?: boolean; | ||
|
|
||
| @Field(() => Int, { nullable: true }) | ||
| @IsNumber() | ||
| @IsOptional() | ||
| polling_interval?: number; | ||
|
|
||
| @Field(() => String, { nullable: true }) | ||
| @IsString() | ||
| @IsOptional() | ||
| control_method?: string; | ||
|
|
||
| @Field(() => FanControlSafetyConfig, { nullable: true }) | ||
| @ValidateNested() | ||
| @Type(() => FanControlSafetyConfig) | ||
| @IsOptional() | ||
| safety?: FanControlSafetyConfig; | ||
| } |
There was a problem hiding this comment.
FanProfileConfig / FanZoneConfig types exist (and FanCurveService expects zones), but FanControlConfig doesn’t include any fields to persist zones/profiles. As written, there’s no way to configure automatic curves via config, which conflicts with the PR description’s “persistent fan profiles” / “temperature-based curves”. Consider adding these fields to FanControlConfig (and the update input) or removing the unused types until they’re supported.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (9)
api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.spec.ts (1)
6-6: Consider instantiating the service instead of extracting prototype method.Extracting
FanCurveService.prototype.interpolateSpeeddirectly works because the method is currently a pure function that doesn't accessthis. However, this pattern is fragile—if the method ever needs instance state (e.g., for logging or accessing injected dependencies), these tests will silently break.♻️ Safer alternative using service instantiation
-const interpolateSpeed = FanCurveService.prototype.interpolateSpeed; +const service = new FanCurveService(); describe('Fan Curve Interpolation', () => { // ... it('should return minimum speed below lowest temp', () => { - expect(interpolateSpeed(balancedCurve, 20)).toBe(30); + expect(service.interpolateSpeed(balancedCurve, 20)).toBe(30); }); // ... update other calls similarly🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.spec.ts` at line 6, The test currently pulls the method off the prototype via FanCurveService.prototype.interpolateSpeed which is fragile; instead instantiate the service and call the instance method so it stays bound to potential future instance state. Replace usages of FanCurveService.prototype.interpolateSpeed with an instance like new FanCurveService() and invoke its interpolateSpeed (or assign const interpolateSpeed = new FanCurveService().interpolateSpeed.bind(instance) if you need a standalone function) so tests use the service instance and will continue to work if interpolateSpeed later references this or injected dependencies.api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.input.ts (2)
1-1: Remove unusedFloatimport.The
Floatscalar is imported but not used in this file.🧹 Remove unused import
-import { Field, Float, InputType, Int } from '@nestjs/graphql'; +import { Field, InputType, Int } from '@nestjs/graphql';🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.input.ts` at line 1, The import list in fancontrol.input.ts includes an unused symbol Float; update the import declaration that currently imports { Field, Float, InputType, Int } to remove Float so it only imports the used symbols (Field, InputType, Int) and then run type-check/lint to confirm no unused-import warnings remain.
13-15: Consider adding range validation for PWM value.The description states "PWM value (0-255)" but there's no
@Min(0)/@Max(255)validation. While the safety service may clamp values, validating at the input layer provides earlier feedback and clearer error messages.✨ Add range validators
+import { IsEnum, IsNumber, IsOptional, IsString, Max, Min } from 'class-validator'; ... `@Field`(() => Int, { description: 'PWM value (0-255)' }) `@IsNumber`() + `@Min`(0) + `@Max`(255) pwmValue!: number;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.input.ts` around lines 13 - 15, Add explicit range validation to the pwmValue input field: annotate the pwmValue property (the field decorated with `@Field`(() => Int, { description: 'PWM value (0-255)' }) and `@IsNumber`()) with `@Min`(0) and `@Max`(255) from class-validator so requests outside 0–255 are rejected with a clear validation error rather than relying only on downstream clamping.api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.service.ts (1)
141-156: Consider documenting the fan type inference heuristic.The assumption that
fanNumber === 1indicates a CPU fan may not hold true for all hardware configurations. This is a reasonable default, but users should be aware this is heuristic-based.Consider adding a brief comment or allowing user override via configuration in a future iteration:
private inferFanType(name: string, fanNumber: number): FanType { const lower = name.toLowerCase(); + // Heuristic: Fan 1 is typically the CPU fan on most motherboards if (lower.includes('cpu') || fanNumber === 1) { return FanType.CPU; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.service.ts` around lines 141 - 156, Add a short doc comment above the inferFanType function explaining that the mapping is heuristic-based (e.g., lowercased name checks and the special-case fanNumber === 1 assumed as CPU) and may not be accurate for all hardware; mention that FanType.CUSTOM is the fallback and add a TODO note to support a user override/configuration in the future so callers can supply explicit mappings if needed. Ensure the comment references the function name inferFanType and the FanType enum so readers can locate the logic and future work.api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/hwmon.service.ts (2)
117-125: Non-null assertions on regex match results could throw.Lines 119 and 124 use
!non-null assertions on regex match results. While the filter ensures the regex matches, TypeScript can't verify this, and future refactoring could break the assumption.♻️ Safer pattern with optional chaining
const fans = files .filter((f) => /^fan\d+_input$/.test(f)) - .map((f) => parseInt(f.match(/^fan(\d+)_input$/)![1], 10)) + .map((f) => { + const match = f.match(/^fan(\d+)_input$/); + return match ? parseInt(match[1], 10) : 0; + }) + .filter((n) => n > 0) .sort((a, b) => a - b); const pwms = files .filter((f) => /^pwm\d+$/.test(f)) - .map((f) => parseInt(f.match(/^pwm(\d+)$/)![1], 10)) + .map((f) => { + const match = f.match(/^pwm(\d+)$/); + return match ? parseInt(match[1], 10) : 0; + }) + .filter((n) => n > 0) .sort((a, b) => a - b);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/hwmon.service.ts` around lines 117 - 125, The code for computing fans and pwms uses non-null assertions on regex match results (in the fans and pwms mapping) which can throw; change the pattern to safely handle potential nulls by capturing the match first and only mapping when match is truthy (e.g., use files.filter(...).map(f => { const m = f.match(...); return m ? parseInt(m[1], 10) : undefined }).filter(Boolean) or use flatMap with a conditional match) so that parseInt is only called with a valid match; update the fans and pwms computations (the variables named fans and pwms) to perform a null-check on the match instead of using `!` and keep the final .sort((a,b)=>a-b).
151-158: Silent error handling in readSysfsInt may mask hardware issues.Returning 0 on any error (line 156) doesn't distinguish between "file doesn't exist" and "permission denied" or "hardware error". For fans, RPM=0 could be misinterpreted as a stopped fan rather than a read failure. Consider logging at debug level for unexpected errors.
💡 Add debug logging for read failures
private async readSysfsInt(devicePath: string, filename: string): Promise<number> { try { const content = await readFile(join(devicePath, filename), 'utf-8'); return parseInt(content.trim(), 10); - } catch { + } catch (err) { + // ENOENT is expected for optional files; log other errors + if ((err as NodeJS.ErrnoException).code !== 'ENOENT') { + this.logger.debug(`Failed to read ${filename} from ${devicePath}: ${err}`); + } return 0; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/hwmon.service.ts` around lines 151 - 158, The readSysfsInt method currently swallows all errors and returns 0, which can hide hardware or permission issues; update the catch to capture the thrown error and emit a debug-level log that includes devicePath, filename and the error details (message/stack) using the file's existing logger (or console.debug if no logger exists) before returning 0 so RPM=0 can be distinguished from a read failure for troubleshooting.api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.resolver.ts (1)
156-169: TheOFFcase in modeToEnable is unreachable due to upstream validation.
validateModeTransition()(per Context snippet 3 atfan-safety.service.ts:104-113) always returnsfalseforFanControlMode.OFF, blocking transitions beforemodeToEnableis ever called with that value. Consider removing theOFFcase or adding a defensive comment explaining it exists for completeness.💡 Optional: Add comment clarifying dead code path
private modeToEnable(mode: FanControlMode): number { switch (mode) { case FanControlMode.MANUAL: return 1; case FanControlMode.AUTOMATIC: return 2; case FanControlMode.FIXED: return 1; case FanControlMode.OFF: + // Unreachable: validateModeTransition blocks OFF transitions return 0; default: return 2; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.resolver.ts` around lines 156 - 169, modeToEnable contains an unreachable FanControlMode.OFF branch because validateModeTransition (see fan-safety.service.ts validateModeTransition) blocks OFF transitions earlier; remove the OFF case from modeToEnable or keep it but add a one-line defensive comment above the switch noting OFF is unreachable due to upstream validation (reference function modeToEnable and enum FanControlMode) so future maintainers understand it’s intentionally redundant.api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/ipmi_fan.service.ts (1)
38-57: Redundant clamping after validation.Line 39-41 validates the value is in 0-255 range, then line 42 clamps again with
Math.max(0, Math.min(255, value)). The clamping is redundant since validation already ensures bounds.♻️ Simplify by removing redundant clamping
async setPwm(devicePath: string, pwmNumber: number, value: number): Promise<void> { if (!Number.isFinite(value) || value < 0 || value > 255) { throw new Error(`Invalid PWM value: ${value}. Must be a number between 0 and 255.`); } - const percent = Math.round((Math.max(0, Math.min(255, value)) / 255) * 100); + const percent = Math.round((value / 255) * 100); try {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/ipmi_fan.service.ts` around lines 38 - 57, In setPwm, remove the redundant clamping and use the already-validated value when computing percent: after the existing Number.isFinite/value-range check in setPwm, replace Math.max(0, Math.min(255, value)) with value so percent is calculated as Math.round((value / 255) * 100); keep the validation/throw as-is and preserve the execa call, timeout handling, logger.debug and error handling in the existing setPwm method.api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.ts (1)
162-167: Silent skip when fan not found in zone configuration.When
readings.find((r) => r.id === fanId)returns undefined (line 164), the code silently continues. This could mask configuration errors where zone.fans contains invalid fan IDs. Consider logging at debug level when a configured fan isn't found.💡 Add debug logging for missing fans
const fan = readings.find((r) => r.id === fanId); - if (!fan || !fan.hasPwmControl) { + if (!fan) { + this.logger.debug(`Fan ${fanId} not found in readings, skipping`); + continue; + } + if (!fan.hasPwmControl) { continue; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.ts` around lines 162 - 167, The loop over zone.fans in fan-curve.service.ts silently skips when readings.find((r) => r.id === fanId) returns undefined; add a debug log to surface misconfigured fan IDs: inside the for (const fanId of zone.fans) block, when fan is falsy before the continue, call the module/process logger at debug (or trace) level with a clear message including the zone.id (or zone identifier), the missing fanId and context (e.g., "configured in zone but no reading found"), then continue; keep existing behavior for fan.hasPwmControl but also consider logging when hasPwmControl is false for completeness.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.ts`:
- Around line 106-111: The setInterval async callback can produce unhandled
promise rejections when this.applyCurves() throws; update the callback used to
set this.curveInterval so the await is wrapped in a try/catch (inside the
existing if (this.isApplyingCurves) guard) and any error is handled/logged
(e.g., via this.logger.error or console.error) so rejections are not left
unhandled; reference the existing symbols: this.curveInterval, setInterval(...),
this.isApplyingCurves, and this.applyCurves().
In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-safety.service.ts`:
- Around line 59-75: restoreAllFans currently clears originalStates and sets
isEmergencyMode = false even if some hwmonService.restoreAutomatic calls fail;
update restoreAllFans to track per-fan failures (use a local failedIds or
failedStates collection), only clear this.originalStates and set
this.isEmergencyMode = false when all restores succeed, otherwise keep failed
entries in this.originalStates for retry and leave isEmergencyMode true; ensure
failures from hwmonService.restoreAutomatic are logged via this.logger.error (as
already done) and consider returning a boolean or throwing an error to surface
partial failure to callers.
- Around line 25-43: captureState currently calls hwmonService.readAll()
internally which can race with the caller's own snapshot; change
captureState(fanId: string, devicePath: string, pwmNumber: number) to accept an
optional reading parameter (e.g., reading?: HwmonReading) and use that reading
to populate originalStates when provided, falling back to hwmonService.readAll()
only if no reading is passed. Update callers such as
FanCurveService.applyCurves() to pass the reading/snapshot they already obtained
instead of letting captureState re-read, and keep the originalStates map
handling and logger.debug behavior unchanged.
---
Nitpick comments:
In
`@api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/hwmon.service.ts`:
- Around line 117-125: The code for computing fans and pwms uses non-null
assertions on regex match results (in the fans and pwms mapping) which can
throw; change the pattern to safely handle potential nulls by capturing the
match first and only mapping when match is truthy (e.g., use
files.filter(...).map(f => { const m = f.match(...); return m ? parseInt(m[1],
10) : undefined }).filter(Boolean) or use flatMap with a conditional match) so
that parseInt is only called with a valid match; update the fans and pwms
computations (the variables named fans and pwms) to perform a null-check on the
match instead of using `!` and keep the final .sort((a,b)=>a-b).
- Around line 151-158: The readSysfsInt method currently swallows all errors and
returns 0, which can hide hardware or permission issues; update the catch to
capture the thrown error and emit a debug-level log that includes devicePath,
filename and the error details (message/stack) using the file's existing logger
(or console.debug if no logger exists) before returning 0 so RPM=0 can be
distinguished from a read failure for troubleshooting.
In
`@api/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/ipmi_fan.service.ts`:
- Around line 38-57: In setPwm, remove the redundant clamping and use the
already-validated value when computing percent: after the existing
Number.isFinite/value-range check in setPwm, replace Math.max(0, Math.min(255,
value)) with value so percent is calculated as Math.round((value / 255) * 100);
keep the validation/throw as-is and preserve the execa call, timeout handling,
logger.debug and error handling in the existing setPwm method.
In
`@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.spec.ts`:
- Line 6: The test currently pulls the method off the prototype via
FanCurveService.prototype.interpolateSpeed which is fragile; instead instantiate
the service and call the instance method so it stays bound to potential future
instance state. Replace usages of FanCurveService.prototype.interpolateSpeed
with an instance like new FanCurveService() and invoke its interpolateSpeed (or
assign const interpolateSpeed = new
FanCurveService().interpolateSpeed.bind(instance) if you need a standalone
function) so tests use the service instance and will continue to work if
interpolateSpeed later references this or injected dependencies.
In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.ts`:
- Around line 162-167: The loop over zone.fans in fan-curve.service.ts silently
skips when readings.find((r) => r.id === fanId) returns undefined; add a debug
log to surface misconfigured fan IDs: inside the for (const fanId of zone.fans)
block, when fan is falsy before the continue, call the module/process logger at
debug (or trace) level with a clear message including the zone.id (or zone
identifier), the missing fanId and context (e.g., "configured in zone but no
reading found"), then continue; keep existing behavior for fan.hasPwmControl but
also consider logging when hasPwmControl is false for completeness.
In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.input.ts`:
- Line 1: The import list in fancontrol.input.ts includes an unused symbol
Float; update the import declaration that currently imports { Field, Float,
InputType, Int } to remove Float so it only imports the used symbols (Field,
InputType, Int) and then run type-check/lint to confirm no unused-import
warnings remain.
- Around line 13-15: Add explicit range validation to the pwmValue input field:
annotate the pwmValue property (the field decorated with `@Field`(() => Int, {
description: 'PWM value (0-255)' }) and `@IsNumber`()) with `@Min`(0) and `@Max`(255)
from class-validator so requests outside 0–255 are rejected with a clear
validation error rather than relying only on downstream clamping.
In
`@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.resolver.ts`:
- Around line 156-169: modeToEnable contains an unreachable FanControlMode.OFF
branch because validateModeTransition (see fan-safety.service.ts
validateModeTransition) blocks OFF transitions earlier; remove the OFF case from
modeToEnable or keep it but add a one-line defensive comment above the switch
noting OFF is unreachable due to upstream validation (reference function
modeToEnable and enum FanControlMode) so future maintainers understand it’s
intentionally redundant.
In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.service.ts`:
- Around line 141-156: Add a short doc comment above the inferFanType function
explaining that the mapping is heuristic-based (e.g., lowercased name checks and
the special-case fanNumber === 1 assumed as CPU) and may not be accurate for all
hardware; mention that FanType.CUSTOM is the fallback and add a TODO note to
support a user override/configuration in the future so callers can supply
explicit mappings if needed. Ensure the comment references the function name
inferFanType and the FanType enum so readers can locate the logic and future
work.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 6bf90734-d7a1-4a75-ad9f-d0602828b9f9
📒 Files selected for processing (23)
api/dev/configs/api.jsonapi/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/controller.interface.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/hwmon.service.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/controllers/ipmi_fan.service.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.spec.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-safety.service.spec.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-safety.service.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol-config.model.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol-config.service.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.input.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.model.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.module.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.resolver.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.service.spec.tsapi/src/unraid-api/graph/resolvers/metrics/fancontrol/fancontrol.service.tsapi/src/unraid-api/graph/resolvers/metrics/metrics.model.tsapi/src/unraid-api/graph/resolvers/metrics/metrics.module.tsapi/src/unraid-api/graph/resolvers/metrics/metrics.resolver.integration.spec.tsapi/src/unraid-api/graph/resolvers/metrics/metrics.resolver.spec.tsapi/src/unraid-api/graph/resolvers/metrics/metrics.resolver.tsapi/src/unraid-api/graph/resolvers/metrics/temperature/temperature.resolver.integration.spec.tspackages/unraid-shared/src/pubsub/graphql.pubsub.ts
| this.curveInterval = setInterval(async () => { | ||
| if (this.isApplyingCurves) { | ||
| return; | ||
| } | ||
| await this.applyCurves(); | ||
| }, interval); |
There was a problem hiding this comment.
Unhandled promise rejections in setInterval callback.
The async callback in setInterval doesn't catch errors from applyCurves(). If applyCurves() throws (e.g., temperature service fails), the promise rejection is unhandled and may cause Node.js warnings or crashes with --unhandled-rejections=strict.
🛡️ Add error handling to interval callback
this.curveInterval = setInterval(async () => {
if (this.isApplyingCurves) {
return;
}
- await this.applyCurves();
+ try {
+ await this.applyCurves();
+ } catch (err) {
+ this.logger.error(`Error applying fan curves: ${err}`);
+ }
}, interval);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| this.curveInterval = setInterval(async () => { | |
| if (this.isApplyingCurves) { | |
| return; | |
| } | |
| await this.applyCurves(); | |
| }, interval); | |
| this.curveInterval = setInterval(async () => { | |
| if (this.isApplyingCurves) { | |
| return; | |
| } | |
| try { | |
| await this.applyCurves(); | |
| } catch (err) { | |
| this.logger.error(`Error applying fan curves: ${err}`); | |
| } | |
| }, interval); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-curve.service.ts`
around lines 106 - 111, The setInterval async callback can produce unhandled
promise rejections when this.applyCurves() throws; update the callback used to
set this.curveInterval so the await is wrapped in a try/catch (inside the
existing if (this.isApplyingCurves) guard) and any error is handled/logged
(e.g., via this.logger.error or console.error) so rejections are not left
unhandled; reference the existing symbols: this.curveInterval, setInterval(...),
this.isApplyingCurves, and this.applyCurves().
| async captureState(fanId: string, devicePath: string, pwmNumber: number): Promise<void> { | ||
| if (this.originalStates.has(fanId)) { | ||
| return; | ||
| } | ||
|
|
||
| const readings = await this.hwmonService.readAll(); | ||
| const reading = readings.find((r) => r.id === fanId); | ||
| if (reading) { | ||
| this.originalStates.set(fanId, { | ||
| devicePath, | ||
| pwmNumber, | ||
| pwmEnable: reading.pwmEnable, | ||
| pwmValue: reading.pwmValue, | ||
| }); | ||
| this.logger.debug( | ||
| `Captured original state for ${fanId}: enable=${reading.pwmEnable}, pwm=${reading.pwmValue}` | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
Potential race condition: captureState reads sysfs independently from caller's snapshot.
captureState() issues its own hwmonService.readAll() at line 30, which could return different values than the snapshot the caller is operating on (e.g., FanCurveService.applyCurves() reads at line 142, then calls captureState() at line 169). If PWM values change between these reads, the captured "original" state may not match what the caller observed.
Consider accepting the reading as an optional parameter to avoid the redundant read and ensure consistency:
💡 Suggested improvement to avoid race
- async captureState(fanId: string, devicePath: string, pwmNumber: number): Promise<void> {
+ async captureState(
+ fanId: string,
+ devicePath: string,
+ pwmNumber: number,
+ existingReading?: { pwmEnable: number; pwmValue: number }
+ ): Promise<void> {
if (this.originalStates.has(fanId)) {
return;
}
- const readings = await this.hwmonService.readAll();
- const reading = readings.find((r) => r.id === fanId);
+ const reading = existingReading ?? (await this.hwmonService.readAll()).find((r) => r.id === fanId);
if (reading) {
this.originalStates.set(fanId, {
devicePath,
pwmNumber,
pwmEnable: reading.pwmEnable,
pwmValue: reading.pwmValue,
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async captureState(fanId: string, devicePath: string, pwmNumber: number): Promise<void> { | |
| if (this.originalStates.has(fanId)) { | |
| return; | |
| } | |
| const readings = await this.hwmonService.readAll(); | |
| const reading = readings.find((r) => r.id === fanId); | |
| if (reading) { | |
| this.originalStates.set(fanId, { | |
| devicePath, | |
| pwmNumber, | |
| pwmEnable: reading.pwmEnable, | |
| pwmValue: reading.pwmValue, | |
| }); | |
| this.logger.debug( | |
| `Captured original state for ${fanId}: enable=${reading.pwmEnable}, pwm=${reading.pwmValue}` | |
| ); | |
| } | |
| } | |
| async captureState( | |
| fanId: string, | |
| devicePath: string, | |
| pwmNumber: number, | |
| existingReading?: { pwmEnable: number; pwmValue: number } | |
| ): Promise<void> { | |
| if (this.originalStates.has(fanId)) { | |
| return; | |
| } | |
| const reading = existingReading ?? (await this.hwmonService.readAll()).find((r) => r.id === fanId); | |
| if (reading) { | |
| this.originalStates.set(fanId, { | |
| devicePath, | |
| pwmNumber, | |
| pwmEnable: reading.pwmEnable, | |
| pwmValue: reading.pwmValue, | |
| }); | |
| this.logger.debug( | |
| `Captured original state for ${fanId}: enable=${reading.pwmEnable}, pwm=${reading.pwmValue}` | |
| ); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-safety.service.ts`
around lines 25 - 43, captureState currently calls hwmonService.readAll()
internally which can race with the caller's own snapshot; change
captureState(fanId: string, devicePath: string, pwmNumber: number) to accept an
optional reading parameter (e.g., reading?: HwmonReading) and use that reading
to populate originalStates when provided, falling back to hwmonService.readAll()
only if no reading is passed. Update callers such as
FanCurveService.applyCurves() to pass the reading/snapshot they already obtained
instead of letting captureState re-read, and keep the originalStates map
handling and logger.debug behavior unchanged.
| async restoreAllFans(): Promise<void> { | ||
| this.logger.warn('Restoring all fans to original states'); | ||
| for (const [fanId, state] of this.originalStates.entries()) { | ||
| try { | ||
| await this.hwmonService.restoreAutomatic( | ||
| state.devicePath, | ||
| state.pwmNumber, | ||
| state.pwmEnable | ||
| ); | ||
| this.logger.log(`Restored fan ${fanId} to enable=${state.pwmEnable}`); | ||
| } catch (err) { | ||
| this.logger.error(`Failed to restore fan ${fanId}: ${err}`); | ||
| } | ||
| } | ||
| this.originalStates.clear(); | ||
| this.isEmergencyMode = false; | ||
| } |
There was a problem hiding this comment.
restoreAllFans clears emergency mode even when individual restorations fail.
If some fans fail to restore (caught at line 69-71), the method still clears originalStates and exits emergency mode at lines 73-74. This means failed fans won't be retried and the system exits emergency mode despite potentially unsafe fan states.
Consider tracking failures and only clearing emergency mode when all fans restore successfully, or keeping failed fan states for retry:
💡 Suggested approach to handle partial failures
async restoreAllFans(): Promise<void> {
this.logger.warn('Restoring all fans to original states');
+ const failedFans: string[] = [];
for (const [fanId, state] of this.originalStates.entries()) {
try {
await this.hwmonService.restoreAutomatic(
state.devicePath,
state.pwmNumber,
state.pwmEnable
);
this.logger.log(`Restored fan ${fanId} to enable=${state.pwmEnable}`);
+ this.originalStates.delete(fanId);
} catch (err) {
this.logger.error(`Failed to restore fan ${fanId}: ${err}`);
+ failedFans.push(fanId);
}
}
- this.originalStates.clear();
- this.isEmergencyMode = false;
+ if (failedFans.length === 0) {
+ this.isEmergencyMode = false;
+ } else {
+ this.logger.warn(`${failedFans.length} fan(s) failed to restore, staying in emergency mode`);
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async restoreAllFans(): Promise<void> { | |
| this.logger.warn('Restoring all fans to original states'); | |
| for (const [fanId, state] of this.originalStates.entries()) { | |
| try { | |
| await this.hwmonService.restoreAutomatic( | |
| state.devicePath, | |
| state.pwmNumber, | |
| state.pwmEnable | |
| ); | |
| this.logger.log(`Restored fan ${fanId} to enable=${state.pwmEnable}`); | |
| } catch (err) { | |
| this.logger.error(`Failed to restore fan ${fanId}: ${err}`); | |
| } | |
| } | |
| this.originalStates.clear(); | |
| this.isEmergencyMode = false; | |
| } | |
| async restoreAllFans(): Promise<void> { | |
| this.logger.warn('Restoring all fans to original states'); | |
| const failedFans: string[] = []; | |
| for (const [fanId, state] of this.originalStates.entries()) { | |
| try { | |
| await this.hwmonService.restoreAutomatic( | |
| state.devicePath, | |
| state.pwmNumber, | |
| state.pwmEnable | |
| ); | |
| this.logger.log(`Restored fan ${fanId} to enable=${state.pwmEnable}`); | |
| this.originalStates.delete(fanId); | |
| } catch (err) { | |
| this.logger.error(`Failed to restore fan ${fanId}: ${err}`); | |
| failedFans.push(fanId); | |
| } | |
| } | |
| if (failedFans.length === 0) { | |
| this.isEmergencyMode = false; | |
| } else { | |
| this.logger.warn(`${failedFans.length} fan(s) failed to restore, staying in emergency mode`); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/src/unraid-api/graph/resolvers/metrics/fancontrol/fan-safety.service.ts`
around lines 59 - 75, restoreAllFans currently clears originalStates and sets
isEmergencyMode = false even if some hwmonService.restoreAutomatic calls fail;
update restoreAllFans to track per-fan failures (use a local failedIds or
failedStates collection), only clear this.originalStates and set
this.isEmergencyMode = false when all restores succeed, otherwise keep failed
entries in this.originalStates for retry and leave isEmergencyMode true; ensure
failures from hwmonService.restoreAutomatic are logged via this.logger.error (as
already done) and consider returning a boolean or throwing an error to surface
partial failure to callers.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1954 +/- ##
==========================================
+ Coverage 51.73% 51.86% +0.12%
==========================================
Files 1026 1038 +12
Lines 70718 71795 +1077
Branches 7881 8075 +194
==========================================
+ Hits 36586 37234 +648
- Misses 34009 34438 +429
Partials 123 123 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
This Implements a fan control system for the Unraid API, enabling real-time fan monitoring and control via Unraid API.
Closes this open feature bounty #1598
What's Included
Fan Monitoring
Fan Control
setFanSpeed,setFanMode,setFanProfile, andcreateFanProfileSafety
Architecture
MetricsResolveralongside CPU, memory, and temperatureHwmonService(sysfs) andIpmiFanService(IPMI)FanControlConfigServicefor persistent configuration viaConfigFilePersisterFanCurveServicefor temperature-linked speed interpolationFanSafetyServicefor safety guards and failure detectionFiles Changed
api/src/unraid-api/graph/resolvers/metrics/fancontrol/MetricsResolver,MetricsModule, andMetricsmodelFAN_METRICSpubsub channelTesting
Summary by CodeRabbit
Release Notes