WIP, ENH: Add support for gamma scaling and iterative solvers. #70
Conversation
src/gfdl/model.py (outdated)

```python
return out
...
class GFDLRegressor(RegressorMixin, MultiOutputMixin, GFDL):
```
Has something changed about GFDLRegressor? The diff seems quite messy here, making it hard to judge what has actually changed and what hasn't. This class was deleted below and then pasted here?
Yes, the class was at the end of the file, but I moved it after GFDLClassifier so that similar estimators sit one after the other. The only other changes are the extra argument `gamma` to the constructor and `solver` to the fit method.
I think it would be best not to reorganize the classes for now, so we can focus on what actually changed in the diff.
Reverted to the old organization.
src/gfdl/model.py (outdated)

```python
return np.random.default_rng(seed)
...
def stochastic_gradient_descent(self, X, y):
```
What's the motivation for rolling our own SGD instead of using upstream SGDRegressor from sklearn, which has learning_rate, eta0, etc.? Does its verbose argument not provide enough output?
You mention it briefly in the matching issue, but my intuition would always be to try the upstream/well-tested solution first before investing time on a hand-rolled version.
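For comparison, a minimal sketch of what the upstream route could look like (synthetic data; parameter values chosen only for illustration, not a recommendation):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5)

# Upstream SGD with the knobs mentioned above; verbose=1 would print
# per-epoch loss, which may or may not be enough output for our needs.
sgd = SGDRegressor(learning_rate="constant", eta0=0.01,
                   max_iter=1000, tol=1e-4, random_state=0)
sgd.fit(X, y)
print(sgd.coef_.shape)  # (5,)
```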
src/gfdl/model.py (outdated)

```python
return self
...
def gradient_descent(self, X, y):
```
Would also be good to clarify why we can't use e.g. scipy.optimize.minimize (which has some output options and custom callback support), and whether we ever need mini-batch/partial_fit() support for these experiments or not.
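As a point of reference, a hedged sketch of the scipy route with a progress callback (toy least-squares problem; the loss/grad names are made up for illustration):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5])

def loss(w):
    r = X @ w - y
    return 0.5 * r @ r

def grad(w):
    return X.T @ (X @ w - y)

# The callback receives the current iterate after each iteration,
# which covers basic progress reporting.
history = []
res = minimize(loss, np.zeros(3), jac=grad, method="L-BFGS-B",
               callback=lambda w: history.append(float(loss(w))))
print(res.x.round(4))
```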
src/gfdl/model.py (outdated)

```diff
 seed: int = None,
 reg_alpha: float = None,
 rtol: float | None = None,
+gamma: float = None,
```
Kostas' email of March 29, 2026 suggests there may be interest in providing different scaling values for different layers and/or applying scaling only to a subset of layers, so the API design may need careful consideration.
The API can now accept a different gamma for each layer.
Beyond the various cleanups and clarifications noted above, this will also need regression tests before it is seriously ready for review. Given the amount of confusion I've seen around gamma scaling, the documentation should also explain how it works with crystal clarity.
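To make the operation concrete, a standalone sketch of the scaling under review (assuming the `H_prev *= 1.0 / (n_hidden ** gamma)` form from the diff below; random data purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
H_prev = rng.normal(size=(4, 8))  # (n_samples, n_hidden)
gamma = 0.5

# Each layer's activations are divided by n_hidden ** gamma before
# feeding the next layer; gamma=0.5 means dividing by sqrt(n_hidden).
scaled = H_prev * (1.0 / (H_prev.shape[1] ** gamma))
print(np.allclose(scaled, H_prev / np.sqrt(8)))  # True
```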
src/gfdl/model.py (outdated)

```diff
 Z = H_prev @ w.T + b  # (n_samples, n_hidden)
 H_prev = self._activation_fn(Z)
+if self.gamma is not None:
+    H_prev *= 1.0 / (H_prev.shape[1] ** self.gamma)
```
@shahyadk-bu, @eiviani-lanl: I would appreciate some feedback on the gamma scaling API implemented here. To my understanding, this is equivalent to Shahyad's implementation here, but it would still be nice to get confirmation from both of you.
Hi Ray,
This seems correct. I have also begun using the non-training method of
solving the RVFL directly, and that code is in my Git as well, in case you
want another reference. It is in the folder called "RVFL_Solved", and the
Python script "RVFL_model.py" there holds my RVFL class; the scaling is
implemented there as well.
Sincerely,
Shahyad
…On Wed, Apr 8, 2026 at 4:23 AM Navamita Ray ***@***.***> wrote:
Hi Shahyad,
```diff
 seed: int = None,
 reg_alpha: float = None,
 rtol: float | None = None,
+gamma: float | np.typing.ArrayLike | None = None,
```
here and elsewhere, note that ArrayLike already includes floats and other scalars: https://numpy.org/devdocs/reference/typing.html#numpy.typing.ArrayLike
```python
before inputting to the next layer. None implies no
scaling is applied. A single value implies applying
the scaling each layer with the same value. For different
gammas per layer, pass an array like type.
```
a float is also an array-like in NumPy speak, so we should probably reword a bit
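One way to both reword and implement this, sketched below (`normalize_gamma` is a hypothetical helper, not part of this PR): treat gamma as None, a scalar, or a per-layer sequence, and broadcast to one value per layer.

```python
import numpy as np

def normalize_gamma(gamma, n_layers):
    """Return None (no scaling) or a float array of shape (n_layers,).

    Hypothetical helper: a scalar broadcasts to every layer; a sequence
    must already have one entry per layer, else broadcasting fails.
    """
    if gamma is None:
        return None
    return np.array(np.broadcast_to(np.asarray(gamma, dtype=float),
                                    (n_layers,)))

print(normalize_gamma(0.5, 3))          # [0.5 0.5 0.5]
print(normalize_gamma([0.1, 0.2, 0.3], 3))  # [0.1 0.2 0.3]
```

With this normalization the docstring can simply say: "None disables scaling; otherwise a scalar or a sequence with one gamma per layer."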
This draft MR contains the changes needed for gamma scaling experiments.
Addresses issue: #71