About the Project (Detailed Scientific Documentation)
This page documents the scientific idea, the MATLAB-to-web conversion, and provides a detailed explanation of the computational services (ANFIS/XAI) with references.
0) Scientific Idea of the Project
The project builds a road-maintenance cost estimation system using ANFIS (Adaptive Neuro-Fuzzy Inference System) based on a Sugeno FIS, then adds an explainability (XAI) layer to clarify “why” the model produced a given cost, not only “what” the cost is.
ANFIS was introduced by Jang (1993) as a framework that combines neural networks with fuzzy inference (FIS) using a hybrid learning procedure to construct an input-output mapping from data and if-then rules. Sugeno FIS is well-suited for modeling because it uses functional (often linear) consequents, supporting interpolation in the input space.
1) Detailed Explanation of HTS Service
In this project, HTS is not a universal standard name; it is an engineering proxy (a severity index). The idea is to aggregate the number of extremely hot days (above 40°C) over a time window representing pavement age prior to the maintenance date. Higher long-term thermal exposure increases the likelihood of asphalt distress (e.g., rutting and deformation).
- pavementAgeCode: pavement age code (1/2/3).
- maintenanceDate: maintenance date (the maintenance year is extracted from it).
- HeatDay table: stores days_above_40 per year.
We compute startYear and then sum days_above_40 from (startYear + 1) to maintenanceYear:
startYear = maintenanceYear - yearsSpan
HTS = Σ_{y = startYear + 1}^{maintenanceYear} days_above_40(y)
This matches the Laravel logic: whereBetween(year, [startYear+1, maintenanceYear]) then sum(days_above_40).
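The windowed sum above can be sketched as follows. This is a minimal illustration in Python, not the project's Laravel code; the function name, the dictionary standing in for the HeatDay table, and the sample values are all hypothetical, and treating missing years as 0 is an assumption of this sketch.

```python
def compute_hts(days_above_40_by_year, maintenance_year, years_span):
    """Sum yearly counts of days above 40 °C over the pavement-age window.

    The window is (startYear + 1) .. maintenanceYear inclusive, where
    startYear = maintenanceYear - yearsSpan. Years missing from the table
    contribute 0 (an assumption of this sketch).
    """
    start_year = maintenance_year - years_span
    return sum(
        days_above_40_by_year.get(year, 0)
        for year in range(start_year + 1, maintenance_year + 1)
    )


# Hypothetical HeatDay rows (year -> days_above_40), for illustration only.
heat_days = {2021: 70, 2022: 85, 2023: 90, 2024: 80}
hts = compute_hts(heat_days, maintenance_year=2024, years_span=3)
print(hts)  # 2022 + 2023 + 2024 = 85 + 90 + 80 = 255
```

Note that startYear itself is excluded, so a span of N years always covers exactly N table rows.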
Research reports that when ambient temperature exceeds 40°C, pavement surface temperature can reach ~65–75°C, which significantly increases the risk of rutting, deformation, and reduced load-bearing capacity. Counting “days above 40°C” is therefore a meaningful simplified proxy for long-term thermal exposure, especially for the climate of Iraq/Karbala.
- Surface temperature range of 65–75°C when ambient temperature exceeds 40°C, and the associated rutting risk. Source: MDPI (2025).
- evalfis and the FIS evaluation concept (HTS later becomes an ANFIS input). Source: MathWorks evalfis.
Research note: HTS here is a designed proxy rather than a universal standard name. The goal is to represent climate severity (heat/time) numerically for the model.
2) Detailed Scientific Explanation of AnfisService (Cost)
This service is the core of the project: it replicates the original MATLAB pipeline. In essence: (1) apply input transforms (log + normalization), (2) evaluate a Sugeno FIS (evalfis concept) to obtain yNorm, (3) reverse the normalization to restore the output scale, then apply the inverse log to return the final cost in IQD.
We use log10(x + 1) to compress large ranges, reduce skewness, and improve numerical stability before normalization and evaluation—common when inputs span wide numeric ranges.
x_log = log10(x + 1)
Implementation note: log10_safe protects against unexpected non-positive values to avoid runtime failures.
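A guarded log transform of this kind might look like the sketch below (Python, for illustration only). How the real log10_safe treats non-positive inputs is not stated here; clamping them to 0.0 is an assumption of this sketch.

```python
import math


def log10_safe(x):
    """Return log10(x + 1) for positive x.

    Assumption of this sketch: non-positive inputs yield 0.0, which equals
    log10(0 + 1); the project's log10_safe may handle them differently.
    """
    if x <= 0:
        return 0.0
    return math.log10(x + 1.0)


print(round(log10_safe(12000), 3))  # 4.079
print(log10_safe(-5))               # 0.0
```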
After log transform, we apply the same MATLAB normalization using PS_in. In MATLAB, mapminmax can return process settings (PS) which can be reused to transform new data identically—ensuring the PHP implementation matches MATLAB exactly.
- PS_in.gain / xoffset / ymin / yrange are used to map each feature to the target range.
- You stored these values in config(anfis.ps_in.*) for reproducibility.
The MathWorks mapminmax documentation explains the normalization and how the returned PS settings can be reused. Source: MathWorks mapminmax.
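Applying stored mapminmax settings reduces to one linear map per feature, y = (x - xoffset) * gain + ymin, where gain was fixed as yrange / (xmax - xmin) when the settings were created. The Python sketch below illustrates this; the function name and the PS_in values are hypothetical, not the project's configuration.

```python
def mapminmax_apply(x, xoffset, gain, ymin):
    """Apply stored mapminmax settings to a feature vector.

    Per feature: y = (x - xoffset) * gain + ymin.
    """
    return [(xi - xo) * g + ymin for xi, xo, g in zip(x, xoffset, gain)]


# Hypothetical settings for two features mapped to [0, 1]:
# feature 1 spans 2..6 (gain = 1/4), feature 2 spans 0..10 (gain = 1/10).
ps_in = {"xoffset": [2.0, 0.0], "gain": [0.25, 0.1], "ymin": 0.0}
x_norm = mapminmax_apply([4.0, 10.0], ps_in["xoffset"], ps_in["gain"], ps_in["ymin"])
print(x_norm)
```

Because the settings are reused rather than recomputed, a new input outside the training range simply maps outside the target range.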
Here we evaluate the Sugeno FIS: compute membership degrees per rule, compute rule firing strengths using an AND operator (e.g., product), compute consequent outputs (often linear), then aggregate in a Sugeno manner to produce yNorm—conceptually matching MATLAB evalfis.
- Sugeno FIS documentation. Source: MathWorks sugfis.
- evalfis documentation (FIS evaluation) as a reference concept. Source: MathWorks evalfis.
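The evaluation steps described above can be sketched for a first-order Sugeno system as follows. This Python sketch assumes Gaussian membership functions, a product AND operator, and weighted-average aggregation; the actual membership function types and parameters come from the .fis file, and the rule layout used here is hypothetical.

```python
import math


def gauss_mf(x, sigma, center):
    """Gaussian membership degree of x."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def evaluate_sugeno(x, rules):
    """Evaluate a first-order Sugeno FIS by weighted average.

    Each rule is (mf_params, coeffs, bias): one (sigma, center) pair per
    input, and a linear consequent z = coeffs . x + bias.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for mf_params, coeffs, bias in rules:
        # Firing strength: product (AND) of the membership degrees.
        w = 1.0
        for xi, (sigma, center) in zip(x, mf_params):
            w *= gauss_mf(xi, sigma, center)
        # Rule output: linear function of the inputs.
        z = sum(c * xi for c, xi in zip(coeffs, x)) + bias
        weighted_sum += w * z
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("no rule fired for this input")
    return weighted_sum / weight_total


# Two hypothetical single-input rules with constant consequents 0 and 1.
rules = [
    ([(0.5, 0.0)], [0.0], 0.0),
    ([(0.5, 1.0)], [0.0], 1.0),
]
print(evaluate_sugeno([0.5], rules))  # both rules fire equally -> 0.5
```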
After yNorm, we restore the original output scale using PS_out (reverse mapminmax) to get outLog. Then apply inverse log to obtain final cost:
outLog = reverse_mapminmax(yNorm, PS_out)
cost = (10 ^ outLog) - 1
Thus you implemented: Log → mapminmax → Sugeno eval → reverse mapminmax → inverse log, mirroring MATLAB.
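The last two steps (reverse mapminmax, then inverse log) can be sketched in Python as below. The reverse map inverts y = (x - xoffset) * gain + ymin; the function names and the PS_out values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def mapminmax_reverse(y, xoffset, gain, ymin):
    """Invert y = (x - xoffset) * gain + ymin for a scalar output."""
    return (y - ymin) / gain + xoffset


def denormalize_cost(y_norm, ps_out):
    """Map the normalized FIS output back to a cost in IQD."""
    out_log = mapminmax_reverse(
        y_norm, ps_out["xoffset"], ps_out["gain"], ps_out["ymin"]
    )
    # Inverse of x_log = log10(x + 1).
    return 10.0 ** out_log - 1.0


# Hypothetical PS_out: log-costs spanning 6..10 mapped to [0, 1] (gain = 1/4).
ps_out = {"xoffset": 6.0, "gain": 0.25, "ymin": 0.0}
print(denormalize_cost(0.5, ps_out))  # out_log = 8.0, so cost = 10**8 - 1
```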
Research value: you did not retrain the model on the web; you converted a trained MATLAB model into PHP while preserving the same transforms and parameters (fis + PS settings). This enables an accurate MATLAB vs PHP comparison.
Jang (1993) describes ANFIS architecture, hybrid learning, and mapping construction from training data. Source: Jang 1993 PDF.
3) Detailed Scientific Explanation of CostGaugeService
After obtaining the total cost, we need a fair classification that remains comparable across projects with different areas. Therefore, we use an index A = cost per square meter (IQD/m²), a common normalization/parametric indicator for meaningful comparison.
A = TotalCost / Area (IQD per m²)
The service first validates Area > 0, computes A, then matches it against configured ranges in config(cost_gauge.ranges).
- If A is within [min, max), select that range.
- If A is below the first range → use the first range.
- If A exceeds the last range → use the last range.
- The result returns label_key + label + range_used + the full ranges for gauge rendering.
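The matching rules above can be sketched as follows (Python, illustration only). The range boundaries and label keys are hypothetical, not the values in config(cost_gauge.ranges), and the sketch assumes the ranges are ordered and contiguous.

```python
def classify_cost(total_cost, area, ranges):
    """Classify cost per m² against ordered, half-open [min, max) ranges."""
    if area <= 0:
        raise ValueError("Area must be greater than zero")
    a_index = total_cost / area
    selected = None
    for r in ranges:
        if r["min"] <= a_index < r["max"]:
            selected = r
            break
    if selected is None:
        # Outside all ranges: clamp to the first or the last range.
        selected = ranges[0] if a_index < ranges[0]["min"] else ranges[-1]
    return {
        "A": a_index,
        "label_key": selected["label_key"],
        "range_used": selected,
        "ranges": ranges,
    }


# Hypothetical ranges in IQD/m².
ranges = [
    {"label_key": "low", "min": 10_000, "max": 30_000},
    {"label_key": "medium", "min": 30_000, "max": 60_000},
    {"label_key": "high", "min": 60_000, "max": 100_000},
]
result = classify_cost(540_000_000, 12_000, ranges)
print(result["A"], result["label_key"])  # 45000.0 medium
```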
Cost per unit area is commonly used in cost estimation as a parametric normalization metric for comparison. Source: RAIC CHOP (Cost Planning).
4) Detailed Explanation of SensitivityService (OAT + Perturbation)
This service produces an “impact percentage” for 7 selected inputs (out of 13). The method is One-At-a-Time (OAT): change one factor at a time and observe the output change. This is a simple, local, user-friendly sensitivity approach.
1) Base cost:
C0 = f(x)
2) For each feature i:
- If continuous: test (xi - 10%) and (xi + 10%)
- If categorical: test all allowed values (except current)
3) Max absolute impact:
Δi = max | f(x_test) - C0 |
4) Impact percentage:
Impact%i = (Δi / C0) * 100
5) Sort descending -> bar chart + top influencers
Finally, results are sorted by impact_percent (DESC). The top_n influencers are used to support the ExplanationService.
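The OAT procedure above can be sketched in Python as below. The predict callable stands in for the ANFIS predictCost call; the toy model, the feature names, and the result keys are hypothetical and only serve to make the sketch runnable.

```python
def oat_sensitivity(predict, inputs, continuous, categorical, step=0.10):
    """One-at-a-time sensitivity: impact % of each feature on the base cost.

    predict     -- callable taking a dict of inputs and returning a cost
    continuous  -- names of features perturbed by -step and +step
    categorical -- dict: feature name -> list of allowed values
    """
    base_cost = predict(inputs)
    if base_cost == 0:
        raise ValueError("base cost is zero; impact % is undefined")
    tests = {
        name: [inputs[name] * (1 - step), inputs[name] * (1 + step)]
        for name in continuous
    }
    for name, allowed in categorical.items():
        tests[name] = [v for v in allowed if v != inputs[name]]

    results = []
    for name, values in tests.items():
        # Change only this feature and keep the largest absolute shift.
        deltas = [abs(predict({**inputs, name: v}) - base_cost) for v in values]
        max_delta = max(deltas, default=0.0)
        results.append({
            "feature": name,
            "delta": max_delta,
            "impact_percent": max_delta / base_cost * 100,
        })
    return sorted(results, key=lambda r: r["impact_percent"], reverse=True)


def toy_predict(x):
    """Toy stand-in for the ANFIS model (not the real cost function)."""
    factor = {1: 1.0, 2: 1.2, 3: 1.5}[x["road_type"]]
    return 100.0 * x["area"] * factor


ranking = oat_sensitivity(
    toy_predict,
    {"area": 100.0, "road_type": 2},
    continuous=["area"],
    categorical={"road_type": [1, 2, 3]},
)
for row in ranking:
    print(row["feature"], round(row["impact_percent"], 2))
```

With this toy model the road type ranks first (about 25%) and the area second (about 10%).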
Advantages:
- Clear to end users: “only this factor changed, and the output shifted by X”.
- Fast and applicable without retraining.
- Suitable as a local (instance-based) explanation per project.
Limitations:
- May miss feature interactions because only one factor changes at a time.
- Depends on the perturbation size (e.g., ±10%).
These limitations are well-known in sensitivity analysis, yet the method remains highly practical for local explainability in a web system.
- OAT concept and “perturbations typically occur one at a time”. Source: Razavi et al., 2021 (ScienceDirect).
- Statement: “each input feature is perturbed one-at-a-time”. Source: Naik et al., 2021 (Springer).
- Model-agnostic XAI methods based on perturbation and observing prediction changes. Source: Wikle et al., 2022 (PMC).
5) Detailed Explanation of ExplanationService (Rule/Template-based NLG)
This service is the language layer over numbers: it converts computed results into a professional explanation. Scientifically, this is Data-to-Text / Natural Language Generation (NLG). Your approach is rule/template-based: it uses fixed templates plus rule-based selection of reasons from top influencers.
- Gauge: classification + A index (cost/m²).
- Sensitivity: top_influencers + debug_payload (direction, delta, ...).
- inputsAssoc: current input values to describe the current state.
- min_influence_pct: minimum impact percentage required to accept a reason
- max_reasons: maximum number of reasons
- Iterate over top_influencers:
  if pct >= minPct -> generate a reason (AR + EN)
- If there are no strong reasons -> fallback sentence
The service also determines direction (increase/decrease/neutral) using delta = cost_after_change - base_cost, then uses that to craft an engineering sentence explaining whether the current value is cheaper or costlier than alternatives within the tested range.
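The selection and direction logic can be sketched in Python as below. The templates, the fallback sentence, the default thresholds, and the payload keys are all hypothetical; only the English side is shown, and the signed delta is assumed to be cost_after_change - base_cost as described above.

```python
TEMPLATES = {
    "increase": "({label}): approx. impact {pct:.2f}%. "
                "Changing this factor increased the cost compared to the current state.",
    "decrease": "({label}): approx. impact {pct:.2f}%. "
                "Changing this factor decreased the cost compared to the current state.",
    "neutral": "({label}): approx. impact {pct:.2f}%. "
               "Changing this factor did not change the cost.",
}
FALLBACK = "No single factor had a strong influence on the estimated cost."


def build_reasons(top_influencers, labels, min_influence_pct=5.0, max_reasons=3):
    """Turn ranked influencers into template-based reason sentences."""
    reasons = []
    for item in top_influencers:
        if len(reasons) >= max_reasons:
            break
        if item["impact_percent"] < min_influence_pct:
            continue  # too weak to be reported as a reason
        delta = item["delta"]  # signed: cost_after_change - base_cost
        direction = "increase" if delta > 0 else "decrease" if delta < 0 else "neutral"
        reasons.append(TEMPLATES[direction].format(
            label=labels[item["feature"]], pct=item["impact_percent"]
        ))
    return reasons or [FALLBACK]


influencers = [
    {"feature": "road_type", "impact_percent": 12.96, "delta": 70_000_000},
    {"feature": "area", "impact_percent": 6.48, "delta": -30_000_000},
    {"feature": "hts", "impact_percent": 1.20, "delta": 5_000_000},
]
labels = {"road_type": "Road Type", "area": "Area", "hts": "HTS"}
for sentence in build_reasons(influencers, labels):
    print(sentence)
```

Keeping the sentences in a template table is what makes each output line traceable to one influencer record.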
- Consistent formal tone (good for academic reporting).
- Auditable: each sentence maps to specific data (debug payload).
- Maintainable: changing templates/rules is easier than training an NLG model.
- Supports bilingual output easily (AR/EN).
- NLG definition (data-to-text generation). Source: IBM (NLG).
- Template-based NLG paper. Source: IAENG PDF.
- Scientific discussion: template-based NLG is not necessarily inferior. Source: van Deemter et al. PDF.
Implementation & System Conversion (MATLAB → Web Chapter)
- UI layer: input/results/reports pages + Plotly charts.
- Service layer: HtsService, AnfisService, CostGaugeService, SensitivityService, ExplanationService.
- Data layer: Projects/Calculations + HeatDay (HTS) + configs for PS settings.
- Report layer: PDF generation (summary + reasons + charts).
User Inputs
↓
(1) HTS ← HeatDay(year → days_above_40)
↓
Build ordered13 (V1..V13)
↓
(2) ANFIS predictCost → TotalCost
↓
(3) Gauge A = TotalCost/Area → label/range
↓
(4) Sensitivity OAT → impact% + top influencers
↓
(5) Explanation NLG → summary + reasons
↓
UI Charts + PDF Report
Here you document the MATLAB training: dataset, sample count, train/validation split, membership function choices, rule count, and training error (MSE/RMSE).
Note: ANFIS training follows the hybrid learning described in Jang 1993.
- Same 13 inputs fed to MATLAB and PHP.
- Compare final cost after inverse log.
- Compute absolute/relative error: |PHP − MATLAB| and %Error.
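The two error measures can be computed as in the short Python sketch below. Taking the MATLAB value as the reference in the denominator of %Error is an assumption of this sketch, and the sample costs are hypothetical.

```python
def compare_outputs(php_cost, matlab_cost):
    """Return (absolute error, percentage error relative to MATLAB)."""
    abs_error = abs(php_cost - matlab_cost)
    pct_error = abs_error / abs(matlab_cost) * 100
    return abs_error, pct_error


# Hypothetical pair of final costs for the same 13 inputs.
abs_err, pct_err = compare_outputs(540_000_540.0, 540_000_000.0)
print(abs_err, pct_err)
```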
HTS validation can be documented by showing: (1) a maintenance date example, (2) computed startYear per code, (3) included years, (4) the sum of days_above_40, with a screenshot from HeatDay table or query output.
In the report, we will include screenshots of CLI commands used to test services (e.g., anfis:test-gauge) to show stable results for the same inputs.
Explain that classification is based on A (cost/m²), not only TotalCost. Include configured ranges and justify them based on study context.
- Log: same equation log10(x+1).
- PS_in: same gain/xoffset/ymin/yrange.
- fis: same .fis file (Sugeno rules).
- PS_out: same reverse normalization.
- Inverse log: (10^outLog) - 1.
These references do not describe your specific conversion; they are official examples of MATLAB web app hosting/concepts that can support the “conversion chapter” context. Source: MathWorks Web App Server.
References
Example Walkthrough (One End-to-End Example)
This single example shows the full pipeline: HTS → ANFIS → Gauge → Sensitivity → Explanation. Numbers are used to explain the logic, while the exact ANFIS value depends on your PS settings and FIS file.
Assume a road-maintenance project with a specific maintenance date and a reasonable area, using a “medium” pavement age code. The goal is to show how values flow through the system.
Since pavement age code = 2, the window = 12 years. Maintenance year = 2024.
yearsSpan = 12
maintenanceYear = 2024
startYear = 2024 - 12 = 2012
HTS = sum(days_above_40) from (2013 .. 2024)
Assume the total days above 40°C from 2013 to 2024 in HeatDay equals: 980 days.
This value becomes one of the ANFIS inputs (within ordered13) based on the ordering.
We feed 13 ordered inputs ordered13 (V1..V13). We show a simplified example for three values to illustrate transformations, then explain how the final cost is obtained.
Assume three raw inputs (subset):
V3 (Area) = 12000
V7 (HTS) = 980
V9 (Traffic) = 2 (categorical code as numeric)
Step A) log10(x+1):
log10(12000+1) ≈ 4.079
log10(980+1) ≈ 2.992
log10(2+1) ≈ 0.477
Step B) mapminmax apply (0..1):
xNorm = applyVector(xLog, PS_in) // exact values depend on PS_in in your config
Step C) Sugeno eval:
yNorm = evaluateSugeno(xNorm) // depends on FIS (.fis rules & parameters)
Step D) reverse output normalization:
outLog = reverseScalar(yNorm, PS_out)
Step E) inverse log:
cost = (10^outLog) - 1
Exact xNorm/yNorm/outLog depend on your PS_in/PS_out and FIS file, so this is a methodological illustration rather than your exact final value.
To complete the end-to-end example, assume ANFIS produced the following total cost: 540,000,000 IQD.
We compute A = TotalCost / Area to compare projects fairly.
A = 540,000,000 / 12,000
= 45,000 IQD/m²
Then the service matches A against config(cost_gauge.ranges) to determine the label (low/medium/high...) and range color.
We show two micro-examples within the same pipeline: a continuous feature (±10%) and a categorical feature (test values). We take the maximum change and convert it to a percentage.
BaseCost = 540,000,000. Test Area=10,800 and Area=13,200 and measure changes.
Base cost C0 = 540,000,000
Test low (Area -10%): newCost = 575,000,000 → diff = 35,000,000
Test high (Area +10%): newCost = 510,000,000 → diff = 30,000,000
Δ = max(diff) = 35,000,000
Impact% = (Δ / C0) * 100
= (35,000,000 / 540,000,000) * 100
≈ 6.48%
Assume current RoadType = 2. Test values [1,3] and pick the maximum difference.
Base cost C0 = 540,000,000
Current RoadType = 2
Test RoadType = 1 → newCost = 520,000,000 → diff = 20,000,000
Test RoadType = 3 → newCost = 610,000,000 → diff = 70,000,000
Δ = 70,000,000
Impact% = (70,000,000 / 540,000,000) * 100
≈ 12.96%
These values are illustrative. In the real system, newCost is obtained by calling predictCost after modifying the feature.
Assume the Gauge classified the case as “Medium” and the top two influencers are RoadType (≈12.96%) and Area (≈6.48%). Below shows how the system generates a Summary + Reasons using templates and direction logic.
—
The estimated cost falls in the (Medium) level. The A index is ≈ 45,000 IQD/m², with a total estimated cost of ≈ 540,000,000 IQD. Overall, this classification aligns with how road characteristics, loading, and environment influence intervention depth.
• (Road Type) — approx. impact 12.96%.
Changing this factor increased the cost compared to the current state.
(Δ ≈ 70,000,000 IQD).
• (Area) — approx. impact 6.48%.
Changing this factor affected the cost within the tested ±10% range.
(Δ ≈ 35,000,000 IQD).
Note: Sentences are built from (1) feature labels from config, (2) impact% from Sensitivity, (3) direction from debug_payload, then a fixed template.