Model Identification
Cubist is a rule-based model that extends Quinlan’s M5 model tree. A tree is grown whose terminal leaves contain linear regression models, each based on the predictors used in the preceding splits.
This machine-learning soft sensor (version V.1.0) has the ID 0003[R]penicillinCUBIST. Its authors are Acosta-Pavas, J. C., Robles-Rodriguez, C. E., Griol, D., Daboussi, F., Aceves-Lara, C. A., & Corrales, D. C., and it is associated with the publication available at https://doi.org/10.1016/j.compchemeng.2024.108736 (DOI link). The model was created on 22-01-2024 and belongs to the project described in Industrial-scale penicillin simulation. Its current status is online, meaning it is loaded and ready to generate predictions.
Model Description
The model uses a CUBIST learner and is classified as interpretable. It is implemented in R 4.3.3, and the corresponding model file is 0003_[R]_penicillin_CUBIST.rds.
The implementation relies on:
- Package: Cubist
- Version: 0.4.4
Model summary: The Cubist learner is an advanced version of M5 that explores nonlinear relationships in observed data.
Input Time Interval:
- One measurement is expected every 12 minutes.
- No aggregation method is applied (set to "NaN").
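Because no aggregation is applied, incoming data presumably needs to arrive already on the 12-minute grid. The following is a minimal base R sketch of how that spacing could be checked before prediction. The timestamps are invented for illustration, since the card does not describe the data source or any timestamp column.

```r
# Hypothetical timestamps: four samples 12 minutes apart, then one 24-minute gap.
timestamps <- as.POSIXct("2024-01-22 00:00:00", tz = "UTC") +
  c(0, 12, 24, 36, 60) * 60

expected_minutes <- 12  # from the model card

# Gap between consecutive measurements, in minutes.
gaps <- as.numeric(diff(timestamps), units = "mins")

# Index of each sample that arrives after an unexpected gap.
irregular <- which(abs(gaps - expected_minutes) > 1e-6) + 1
print(irregular)
```

Samples flagged this way would need to be handled upstream, since the model itself performs no resampling.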
Training Information
The training dataset contains 89,800 instances.
Hyperparameters used:
- Minimum number of instances per leaf: 12,000
- Number of committees: 1
- Instance-based corrections: 3
The model was validated using 10-fold cross-validation with 3 repetitions. It was trained using experiments with the following IDs: [1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 17, 19, 20, 21, 22, 24, 25, 26, 27, 29, 30, 31, 33, 34, 36, 37, 38, 39, 40, 42, 43, 44, 45, 47, 48, 49, 50, 51, 53, 54, 55, 56, 57, 58, 59, 61, 62, 65, 66, 67, 69, 70, 71, 72, 73, 74, 75, 76, 77, 80, 81, 82, 83, 84, 85, 87, 88, 89, 90, 92, 93, 94, 95, 96, 97, 98, 99].
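The validation scheme can be illustrated with a short base R sketch. It assumes rows are assigned to folds at random. The card does not state whether folds were grouped by experiment ID, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, not taken from the model card

n_instances <- 89800  # training set size from the card
n_folds <- 10
n_repeats <- 3

# For each repetition, shuffle a balanced vector of fold labels 1..10.
fold_ids <- lapply(seq_len(n_repeats), function(r) {
  sample(rep(seq_len(n_folds), length.out = n_instances))
})

# Every fold is held out once per repetition: 3 * 10 = 30 model fits.
n_fits <- n_repeats * n_folds

# Held-out size per fold: 89800 / 10 = 8980 instances.
fold_sizes <- table(fold_ids[[1]])
print(n_fits)
print(fold_sizes)
```

Under this assumption each fit trains on 80,820 instances and is evaluated on the remaining 8,980.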
Model Inputs
The model receives a set of sensor measurements, actuator settings, and computed variables, each with defined units and an expected range. No feature scaling is applied.
Features
| Name | Type | Description | Units | Lag | Scaling | Expected Min | Expected Max |
|---|---|---|---|---|---|---|---|
| temperature | Sensor | Current temperature (T) in the bioreactor | K | 0 | none | 298 | 308 |
| pH | Sensor | Current pH level in the bioreactor | pH | 0 | none | 5.5 | 7.5 |
| dissolved_oxygen_concentration | Sensor | Dissolved oxygen (DO) concentration | mg/L | 0 | none | 0 | 10 |
| agitator | Actuator | Agitation speed (rpm) | rpm | 0 | none | 100 | 1200 |
| CO2_percent_in_off_gas | Sensor | CO₂ percentage in off-gas (CO₂,og) | % | 0 | none | 0 | 10 |
| oxygen_in_percent_in_off_gas | Sensor | O₂ percentage in off-gas (O₂,og) | % | 0 | none | 10 | 21 |
| vessel_volume | Computed variable | Total vessel volume (V) | L | 0 | none | 1 | 1000 |
| sugar_feed_rate | Actuator | Sugar feed rate (Fs) into the bioreactor | L/h | 0 | none | 0 | 2 |
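The expected ranges in the table can be turned into a simple pre-prediction check. The sketch below uses base R only. `check_ranges()` and the sample values are hypothetical and not part of the published model. The feature names and bounds are copied from the table.

```r
# Expected ranges, copied from the Features table.
expected_ranges <- data.frame(
  name = c("temperature", "pH", "dissolved_oxygen_concentration", "agitator",
           "CO2_percent_in_off_gas", "oxygen_in_percent_in_off_gas",
           "vessel_volume", "sugar_feed_rate"),
  min = c(298, 5.5, 0, 100, 0, 10, 1, 0),
  max = c(308, 7.5, 10, 1200, 10, 21, 1000, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the names of features with missing or
# out-of-range values, and stops if a required column is absent.
check_ranges <- function(newdata, ranges = expected_ranges) {
  missing_cols <- setdiff(ranges$name, names(newdata))
  if (length(missing_cols) > 0) {
    stop("Missing input columns: ", paste(missing_cols, collapse = ", "))
  }
  out_of_range <- character(0)
  for (i in seq_len(nrow(ranges))) {
    x <- newdata[[ranges$name[i]]]
    if (any(is.na(x) | x < ranges$min[i] | x > ranges$max[i])) {
      out_of_range <- c(out_of_range, ranges$name[i])
    }
  }
  out_of_range
}

# Invented sample: dissolved oxygen (12 mg/L) exceeds the expected maximum of 10.
sample_input <- data.frame(
  temperature = 299, pH = 6.5, dissolved_oxygen_concentration = 12,
  agitator = 100, CO2_percent_in_off_gas = 1.5,
  oxygen_in_percent_in_off_gas = 20, vessel_volume = 500, sugar_feed_rate = 1
)
flagged <- check_ranges(sample_input)
print(flagged)
```

Whether out-of-range inputs should be rejected or only logged is a deployment decision that the card does not specify.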
Model Output
The model predicts:
| Name | Description | Units | Forecast Horizon | Scaling | Expected Min | Expected Max |
|---|---|---|---|---|---|---|
| penicillin_concentration | Prediction of the penicillin concentration | g L⁻¹ | 0 | none | 0 | 50 |
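A prediction call might look like the sketch below. It assumes the .rds file has been downloaded to the working directory and contains either a bare `cubist` fit or a wrapper object such as a caret `train` model. The card does not say which. It also assumes that "Instance-based corrections: 3" corresponds to the `neighbors` argument of `predict()` for `cubist` objects. `predict_penicillin()` and the input values are hypothetical.

```r
# Assumed local path to the file listed on the model page.
model_path <- "0003_[R]_penicillin_CUBIST.rds"

# Hypothetical wrapper, not part of the published model.
predict_penicillin <- function(model, newdata) {
  if (inherits(model, "cubist")) {
    # Bare Cubist fit: request the 3 instance-based corrections explicitly.
    predict(model, newdata = newdata, neighbors = 3)
  } else {
    # Other wrappers (e.g. a caret "train" object) carry their own settings;
    # the package that created the object must be installed and loaded.
    predict(model, newdata = newdata)
  }
}

# Runs only if the model file and the Cubist package are available.
if (file.exists(model_path) && requireNamespace("Cubist", quietly = TRUE)) {
  model <- readRDS(model_path)
  # Invented input row within the expected ranges of the Features table.
  newdata <- data.frame(
    temperature = 299, pH = 6.5, dissolved_oxygen_concentration = 8,
    agitator = 100, CO2_percent_in_off_gas = 1.5,
    oxygen_in_percent_in_off_gas = 20, vessel_volume = 500, sugar_feed_rate = 1
  )
  print(predict_penicillin(model, newdata))  # predicted concentration, g/L
}
```

Since the forecast horizon is 0, the returned value is an estimate of the current penicillin concentration for the supplied measurements.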
SEEK ID: https://ibisbahub.eu/models/25?version=2
Associated file: 0003_[R]_penicillin_CUBIST.rds (Gzip archive, 1.58 MB)
Organism: Penicillium chrysogenum
Model type: AI/ML
Model format: R code
Execution or visualisation environment: R
Created: 15th Dec 2025 at 10:27
Last updated: 15th Dec 2025 at 12:00
Attributions: None
Version History
Version 2 (latest) Created 15th Dec 2025 at 11:15 by David Camilo Corrales
No revision comments
Version 1 (earliest) Created 15th Dec 2025 at 10:27 by David Camilo Corrales
No revision comments
https://orcid.org/0000-0003-4717-3040