Run many packages in parallel to avoid overhead and make use of CPUs #16

Merged
Ghost merged 5 commits from parallize_packages into main 2022-11-04 10:04:15 +01:00
First-time contributor
No description provided.
Ghost added 3 commits 2022-11-04 07:49:52 +01:00
This way we avoid duplicating all startup and SQL queries
Threads appear to be too dangerous for this
aplanas reviewed 2022-11-04 08:13:53 +01:00
git-importer.py Outdated
@ -95,3 +103,3 @@
if args.export:
TestExporter(args.package).run()
if len(args.packages) != 0:
Owner

You mean `> 1`?
Author

Actually I meant != 1 :)
aplanas marked this conversation as resolved
git-importer.py Outdated
@ -110,3 +116,1 @@
)
exporter.set_gc_interval(args.gc)
exporter.export_as_git()
with concurrent.futures.ProcessPoolExecutor() as executor:
Owner

Setting `max_workers` to `None` will create one process per CPU. That is OK on my desktop, but my laptop reports something like 32 of them, and I am sure this will kill OBS. Maybe add a parameter with a low default?
Author

I guess I can just hard-code it to 8. Would that work for you?
Owner

seems safer
aplanas marked this conversation as resolved
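
For readers following the `max_workers` discussion above, here is a minimal sketch of a capped process pool. It is not the code from this PR: `worker_count`, `export_package`, and `export_all` are hypothetical names, and taking the minimum of the CPU count and the limit is an assumption that goes slightly beyond simply passing a hard-coded 8.

```python
import concurrent.futures
import os

MAX_WORKERS = 8  # the hard-coded cap agreed on in the review


def worker_count(cpus, limit=MAX_WORKERS):
    """Return how many worker processes to start.

    Hypothetical helper: never more than `limit`, never fewer than 1.
    `cpus` may be None because os.cpu_count() can return None.
    """
    return max(1, min(limit, cpus or 1))


def export_package(package):
    # Hypothetical stand-in for the real per-package export work.
    return f"exported {package}"


def export_all(packages):
    # With max_workers=None the pool would size itself from the machine's
    # CPU count; passing an explicit number keeps the load bounded.
    workers = worker_count(os.cpu_count())
    with concurrent.futures.ProcessPoolExecutor(max_workers=workers) as executor:
        fs = [executor.submit(export_package, p) for p in packages]
        concurrent.futures.wait(fs)
    # wait() does not re-raise worker exceptions; result() does.
    return [f.result() for f in fs]
```

Calling `export_all(["pkg-a", "pkg-b"])` from a script guarded by `if __name__ == "__main__":` would run the two exports in separate processes and return their results in submission order.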
@ -205,0 +222,4 @@
]
concurrent.futures.wait(fs)
self.db.conn.commit()
Owner

Why is this required?
Author

It's just to avoid wasting too much time when the script exits in the follow-up step.
aplanas marked this conversation as resolved
@ -205,0 +230,4 @@
]
concurrent.futures.wait(fs)
self.db.conn.commit()
Owner

Same? Is there any transaction open?
aplanas marked this conversation as resolved
@ -205,0 +243,4 @@
for project, package in cur.fetchall()
]
concurrent.futures.wait(fs)
self.db.conn.commit()
Owner

Doesn't closing the cursor via `with` do the commit?
Author

No, only closing the database connection does. Everything until then stays in a transaction.
aplanas marked this conversation as resolved
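
The cursor-versus-commit question above can be illustrated with a small sketch. The thread does not show which database driver the importer uses, so this uses the standard-library `sqlite3` module purely as a stand-in; other drivers may differ, in particular in what happens when the connection is closed. The `packages` table and the function name are hypothetical.

```python
import contextlib
import sqlite3


def cursor_close_leaves_transaction_open():
    """Show that closing a cursor does not end the transaction (sqlite3 only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE packages (name TEXT)")
    conn.commit()

    # sqlite3 cursors are not context managers, so closing() stands in
    # for the `with ... as cur:` form discussed in the review.
    with contextlib.closing(conn.cursor()) as cur:
        cur.execute("INSERT INTO packages VALUES ('example')")

    # The cursor is closed, but the INSERT is still uncommitted.
    open_after_cursor_close = conn.in_transaction

    conn.commit()
    open_after_commit = conn.in_transaction

    conn.close()
    return open_after_cursor_close, open_after_commit
```

Under these assumptions the function returns `(True, False)`: the transaction is still open after the cursor is closed and only ends with the explicit `commit()`, which matches the reasoning given for the `self.db.conn.commit()` calls.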
Ghost added 1 commit 2022-11-04 09:58:46 +01:00
aplanas approved these changes 2022-11-04 09:59:11 +01:00
Ghost added 1 commit 2022-11-04 10:01:36 +01:00
This hard-codes the limit; we may want to make it configurable,
but for now the machines meant to run this code are very similar
aplanas approved these changes 2022-11-04 10:02:43 +01:00
Ghost merged commit 4cc0a23d4e into main 2022-11-04 10:04:15 +01:00
Reference: importers/git-importer#16