Commit 1ea710a6, authored 5 days ago by Riko Corwin Uphoff

Merged main (scheduler fix)

Parents: 06c069aa, 2687e71a
Pipeline #25397: passed, 5 days ago (stage: build)
Showing 3 changed files with 19 additions and 13 deletions:

- load_lr_scheduler.py: 2 additions, 2 deletions
- load_optimizers.py: 2 additions, 2 deletions
- scripts/windows/test.bat: 15 additions, 9 deletions
load_lr_scheduler.py (+2 −2)

```diff
@@ -17,9 +17,9 @@ def get_scheduler(
     warm_up_scheduler = ConstantLR(optimizer, 1.0, warm_up_steps)
     if scheduler_type == "constant":
-        annealing_scheduler = ConstantLR(optimizer, max_lr, annealing_steps)
+        annealing_scheduler = ConstantLR(optimizer, 1.0, annealing_steps)
     elif scheduler_type == "linear":
-        annealing_scheduler = LinearLR(optimizer, max_lr, min_lr, annealing_steps)
+        annealing_scheduler = LinearLR(optimizer, 1.0, min_lr / max_lr, annealing_steps)
     elif scheduler_type == "cosine":
         annealing_scheduler = CosineAnnealingLR(optimizer, annealing_steps, min_lr)
     else:
```
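The two changed lines follow from how PyTorch's ConstantLR and LinearLR work: their factor arguments multiply the optimizer's base learning rate rather than set an absolute rate, so passing max_lr and min_lr directly double-scales the schedule. A minimal sketch, not from the repo, using the same names (max_lr, min_lr, annealing_steps) as the diff:

```python
# Hedged sketch: LinearLR/ConstantLR factors are relative to the optimizer's base lr,
# which is why the fixed code passes 1.0 and min_lr / max_lr instead of absolute values.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LinearLR

max_lr, min_lr, annealing_steps = 1e-3, 1e-5, 5
param = torch.nn.Parameter(torch.zeros(1))
optimizer = AdamW([param], lr=max_lr)  # the absolute rate lives on the optimizer

# Anneal linearly from 1.0 * max_lr down to (min_lr / max_lr) * max_lr == min_lr.
scheduler = LinearLR(optimizer, start_factor=1.0, end_factor=min_lr / max_lr,
                     total_iters=annealing_steps)

for _ in range(annealing_steps):
    optimizer.step()      # normally preceded by loss.backward()
    scheduler.step()
print(optimizer.param_groups[0]["lr"])  # ~= min_lr after annealing_steps steps
```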
load_optimizers.py (+2 −2)

```diff
@@ -45,7 +45,7 @@ def load_galore_config(args):
 def get_optimizer(args, model):
     """
     Creates optimizer (GaLore, LoRa, or baseline AdamW)
     """
-    default_lr = 1.0  # Will be scheduled by LRScheduler
+    default_lr = args.lr
     if args.optimizer == "baseline":
         return AdamW(model.parameters(), lr=default_lr, weight_decay=args.weight_decay), model
@@ -66,7 +66,7 @@ def get_optimizer(args, model):
     if args.optimizer == "lora":
         return AdamW(model.parameters(), lr=args.lr), model
     else:
-        galore_config = load_galore_config()
+        galore_config = load_galore_config(args)
         trainable_params = [p for p in model.parameters() if p.requires_grad and p.dim() > 1]
         param_groups = [{"params": trainable_params, **galore_config}
```
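The second hunk matters because the dict returned by load_galore_config(args) is splatted into a parameter group, so every key in it becomes a per-group optimizer option (the test script's --rank, --galore_alpha, --galore_T flags presumably end up there). The merging mechanism itself is stock PyTorch; the sketch below is not the repo's code and uses plain AdamW with a group-level weight_decay as a stand-in for the GaLore-specific keys:

```python
# Hedged sketch: options merged into a param group apply only to that group.
# The diff's {"params": trainable_params, **galore_config} relies on this mechanism;
# here weight_decay stands in for whatever keys the GaLore optimizer actually reads.
import torch
from torch.optim import AdamW

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.Linear(8, 2))

# Same filter as the diff: trainable matrices (dim > 1) go in the special group.
matrix_params = [p for p in model.parameters() if p.requires_grad and p.dim() > 1]
other_params = [p for p in model.parameters() if p.requires_grad and p.dim() <= 1]

group_config = {"weight_decay": 0.1}          # stand-in for **galore_config
param_groups = [
    {"params": matrix_params, **group_config},
    {"params": other_params},                 # falls back to the optimizer defaults
]
optimizer = AdamW(param_groups, lr=1e-5, weight_decay=0.0)
print(optimizer.param_groups[0]["weight_decay"])  # 0.1, taken from the merged config
```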
scripts/windows/test.bat (+15 −9)

```diff
 @echo off
 python main.py ^
---mode pretraining ^
+--mode finetuning ^
 --optimizer galore ^
---model llama_60m ^
---batch_size 8 ^
+--model roberta ^
+--dataset glue_cola ^
+--batch_size 32 ^
+--num_epochs 30 ^
+--max_length 512 ^
---num_training_tokens 1000000 ^
---shuffle false ^
 --dtype bf16 ^
---lr 4e-4 ^
---weight_decay 0.01 ^
---tmax 30 ^
---test true
\ No newline at end of file
+--lr_scheduler constant ^
+--lr 1e-5 ^
+--lr_min 1e-8 ^
+--warm_up_fraction 0 ^
+--weight_decay 0 ^
+--rank 8 ^
+--galore_alpha 2 ^
+--galore_T 200 ^
+--lora_alpha 8 ^
+--lora_dropout 0.1 ^
+--test false
\ No newline at end of file
```
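The scheduler flags in the new script (--lr 1e-5, --lr_min 1e-8, --lr_scheduler constant, --warm_up_fraction 0) map onto the warm-up and annealing pair visible in load_lr_scheduler.py; with the constant branch, --lr_min only matters for the linear and cosine branches. This diff does not show how the repo chains the two schedulers, so the sketch below assumes a standard torch SequentialLR combination and borrows the flag values from test.bat:

```python
# Hedged sketch: warm-up followed by annealing, built from the same PyTorch pieces
# as load_lr_scheduler.py. SequentialLR is an assumption; the diff does not show
# how the repo combines warm_up_scheduler and annealing_scheduler.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import ConstantLR, SequentialLR

lr, warm_up_fraction, total_steps = 1e-5, 0.0, 1000            # values from test.bat
warm_up_steps = max(int(warm_up_fraction * total_steps), 1)    # ConstantLR needs >= 1 iter
annealing_steps = total_steps - warm_up_steps

param = torch.nn.Parameter(torch.zeros(1))
optimizer = AdamW([param], lr=lr)

warm_up = ConstantLR(optimizer, factor=1.0, total_iters=warm_up_steps)
annealing = ConstantLR(optimizer, factor=1.0, total_iters=annealing_steps)  # "constant" branch
scheduler = SequentialLR(optimizer, [warm_up, annealing], milestones=[warm_up_steps])
```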