Ken and I have been thinking of shutting down Jenkins (OK, it was my idea) for a week after 4.5 is released. The aim is to address hanging issues.
A week is a long time from a technical point of view, but it allows people using it to take a break. Mainly we're thinking of Frank here. We're thinking of upgrades, disk usage, necessary and unnecessary builds (if there are any). Basically stopping that world for a week.
What do you think, Frank? If you are opposed, then we'll chuck this idea.
Chris
On 20 February 2014 16:20, Chris Cunnington websela@yahoo.com wrote:
Ken and I have been thinking of shutting down Jenkins (OK, it was my idea) for a week after 4.5 is released. The aim is to address hanging issues.
A week is a long time from a technical point of view, but it allows people using it to take a break. Mainly we're thinking of Frank here. We're thinking of upgrades, disk usage, necessary and un-necessary builds (if there are any). Basically stopping that world for a week.
What do you think, Frank? If you are opposed, then we'll chuck this idea.
If by "address hanging issues" you're talking about the Squeak processes that get left orphaned after builds, they can only get resolved by keeping things running and having someone sufficiently knowledgeable (which, I suspect, is just me) digging into the problem. Mainly it's a coordination problem between rake, Ruby shell commands, and Squeak processes.
But I'm away on business all of next week and will have no chance to do any work on CI, so from that perspective I'm fine with Jenkins taking the week off. As long as we advertise it widely, and put up a holding page explaining the story on build.squeak.org.
Thanks for consulting with me! (I hope my text conveys a lack of sarcasm here!)
frank
What problem are we trying to solve here?
If there are Jenkins jobs that cause problems, and if those problems cannot be addressed right away, then the appropriate thing to do is disable them using the normal Jenkins console. If an explanation is needed, just update the job description to say what is going on.
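Disabling a single job, as suggested above, can also be scripted rather than clicked. The following is a minimal sketch, assuming the Jenkins remote API's `/job/<name>/disable` POST endpoint; the host name is hypothetical, and authentication and any CSRF crumb the server may require are deliberately left out. It only builds the request and does not send it.

```python
import urllib.parse
import urllib.request


def disable_job_request(base_url, job_name):
    """Build (but do not send) a POST request that disables one Jenkins job."""
    # Assumption: the server exposes POST <base>/job/<name>/disable.
    url = f"{base_url.rstrip('/')}/job/{urllib.parse.quote(job_name)}/disable"
    return urllib.request.Request(url, data=b"", method="POST")


# Hypothetical host; the job name is one mentioned later in this thread.
req = disable_job_request("http://jenkins.example", "ExternalPackages-Xtreams")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it, once credentials have been added.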
A little bit of updating of the Jenkins job descriptions would do no harm in any case. Sort of like a class comment: "I am a Jenkins job that tests the FreebleBaz package. If I stop working, please contact bilbo@baggins.org".
:)
Dave
Let me just list the issues I'm aware of, not that these can all be fixed in the same way or require any significant overall downtime.
1. Jenkins broke some of our build processes with a release months ago. Since that time we have been pinned to a specific release and have not updated. Initially the plan was to be agile and keep up to date with Jenkins releases, but no one has found the time to figure out why the builds broke or at least the proper way to address the problem. I know Frank tried but he has only so much time and other fish to fry. I approached Chris C as he was the original instigator for Jenkins to try to see if he had the interest to help Frank out.
2. The issue I have harped on about in the past: the filesystem on box3 is filling up. I'm convinced that Jenkins jobs are wasting space somewhere, or that maybe there are some jobs that can be deleted. I'm just speculating, but there are a number of jobs that have not succeeded in months. By the way, growth has been generally slow of late, but we are at 97%: no immediate fear, but 'vigilance!'. If ultimately build.squeak.org is as big as it is because it has to be, then we probably need to approach SFC and see if there is budget to upgrade the disk space on box3. That's not my first choice, however.
3. The issue that Chris has referred to: we still fairly regularly get stuck jobs that have to be killed manually.
Ken
On 02/20/2014 11:11 AM, David T. Lewis wrote:
What problem are we trying to solve here?
If there are Jenkins jobs that cause problems, and if those problems cannot be addressed right away, then the appropriate thing to do is disable them using the normal Jenkins console. If an explanation is needed, just update the job description to say what is going on.
A little bit of updating of the Jenkins job descriptions would do no harm in any case. Sort of like a class comment: "I am a Jenkins job that tests the FreebleBaz package. If I stop working, please contact bilbo@baggins.org".
:)
Dave
Ken, thanks for the explanation. I recognize all of those issues.
I do think it is more appropriate to use the Jenkins UI to turn off the problematic jobs until the issues can be addressed, as opposed to shutting down the whole Jenkins system.
Yes, some of our Jenkins jobs are wasting a lot of space. Yes, that is a fixable problem. No, I don't think that a bigger disk drive will fix it ;-)
Dave
OK, well, nobody thinks that's a good idea, so it probably isn't.
And yet box2 runs a version of Linux from 2006 or something like that.
Sooo, perhaps there's a season for upgrades, and I posit that just after a version of Squeak has been released may be that time. When Frank goes on his trip he knows Jenkins may be a bit odd because it's been upgraded to a new version? No?
In the seasons that are Squeak there probably needs to be a window for this kind of thing.
Chris
On Feb 20, 2014, at 3:51 PM, David T. Lewis lewis@mail.msen.com wrote:
Ken, thanks for the explanation. I recognize all of those issues.
I do think it is more appropriate to use the Jenkins UI to turn off the problematic jobs until the issues can be addressed, as opposed to shutting down the whole Jenkins system.
Yes, some of our Jenkins jobs are wasting a lot of space. Yes, that is a fixable problem. No I don't think that a bigger disk drive will fix it ;-)
Dave
On 20 February 2014 22:30, Chris Cunnington websela@yahoo.com wrote:
OK, well, nobody thinks that's a good idea, so it probably isn't.
And yet box2 runs a version of Linux from 2006 or something like that.
Sooo, perhaps there's a season for upgrades and I posit that after version of Squeak has been released may be that time. When Frank goes on his trip he knows Jenkins maybe a bit odd because it's been upgraded to a new version? No?
In the seasons that are Squeak there probably needs to be a window for this kind of thing.
Maybe run the idea past Chris Muller, because CI impacts 4.5's release process. But maybe not heavily: I _think_ I remember Chris & I agreeing that a human would take a ReleaseSqueakTrunk artifact and give it a final polish anyway, and we do have a release candidate out that hasn't had any reported problems...
If Chris is OK with it, and releases 4.5, I think that's a fine time to upgrade bits that need upgrading. We froze Jenkins' auto-upgrading for a time, but it was a stopgap measure. I just won't be able to help out with anything until March.
frank
No conflicts here. I didn't use the Jenkins builds the last couple of times because it seemed to have 13352, plus I'm hoping we're done with 4.5 now that 13680 was released (final).
On Thu, Feb 20, 2014 at 4:34 PM, Frank Shearar frank.shearar@gmail.com wrote:
Maybe run the idea past Chris Muller, because CI impacts 4.5's release process. But maybe not heavily: I _think_ I remember Chris & I agreeing that a human would take a ReleaseSqueakTrunk artifact and give it a final polish anyway, and we do have a release candidate out that hasn't had any reported problems...
If Chris is OK with it, and releases 4.5, I think that's a fine time to upgrade bits that need upgrading. We froze Jenkins' auto-upgrading for a time, but it was a stopgap measure. I just won't be able to help out with anything until March.
frank
On 20 February 2014 20:51, David T. Lewis lewis@mail.msen.com wrote:
Ken, thanks for the explanation. I recognize all of those issues.
I do think it is more appropriate to use the Jenkins UI to turn off the problematic jobs until the issues can be addressed, as opposed to shutting down the whole Jenkins system.
Not really. Other than the InterpreterVM and CogVM jobs (I'm running off memory here), all the other jobs use rake, and shell out to run Squeak, and do fancy things to avoid hung builds. Now I'm absolutely 100% sure that the orphaned processes are my fault. Clearly I don't understand the intricacies of shells and subshells and process ownership. The problem is the disconnect between the sets of people who know very well what the squeak-ci code does (me) and people who understand the unix process model (and ownership in particular) (not me).
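One common way to avoid orphans of the kind described above is to start the child in its own process group and, on timeout, signal the whole group rather than just the immediate child. The following is a minimal sketch of that technique, written in Python for illustration only: the squeak-ci code itself is rake and Ruby, the function name and timeouts are hypothetical, and this is not a description of how squeak-ci currently launches Squeak. It is POSIX-only.

```python
import os
import signal
import subprocess


def run_in_own_group(cmd, timeout_s):
    """Run cmd in a new session; on timeout, signal its whole process group.

    Returns (returncode, timed_out).
    """
    # start_new_session=True calls setsid() in the child, so the child leads
    # a fresh process group whose id equals its pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        pgid = proc.pid
        # Ask every process in the group to stop, including grandchildren
        # started by intermediate shells.
        os.killpg(pgid, signal.SIGTERM)
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            # The direct child ignored SIGTERM, so force the whole group.
            os.killpg(pgid, signal.SIGKILL)
            proc.wait()
        return proc.returncode, True
```

A call such as `run_in_own_group(["sh", "-c", "sleep 30 & wait"], 0.5)` times out and signals both the shell and the background `sleep`. A grandchild that ignores SIGTERM after the direct child has already exited would still survive, so this narrows the problem rather than closing it.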
Yes, some of our Jenkins jobs are wasting a lot of space. Yes, that is a fixable problem. No I don't think that a bigger disk drive will fix it ;-)
I thought I'd take a look at ExternalPackages-Xtreams, one of the bigger jobs at 511M.
By far the biggest part of the disk usage - 221M - is simply the repository itself. This is because we store big fat blobs of binary data (images) in the repository. Upgrading these is simply wasteful. Maybe some serious git guru-ness might be applied to reduce this. I think there are tricks to remove the presence and history of large binaries, but I'd have to look it up.
The target/ directory takes up no less than 195M. It has three VMs (like most of these jobs): each Cog VM directory takes 14M, while the Interpreter VM takes up 38M. (This is the source: because every job can run on any agent, and that agent could have any manner of glibc, we _build_ an Interpreter VM and memoise the artifact). target/package-cache/ takes up 37M, presumably because jobs update the trunk image from the base CI, save that, then load the package under test (Xtreams in this case).
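A breakdown like the one above can be reproduced for any job workspace with a few lines of script. This is a minimal sketch, assuming only that a workspace is an ordinary directory tree; the function names are hypothetical. It sums apparent file sizes, so its figures can differ somewhat from what `du` reports in disk blocks.

```python
import os


def dir_sizes(root):
    """Return {entry_name: total_bytes} for each immediate child of root."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            # os.walk does not descend into symlinked directories by default.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    # lstat so a symlink counts as a link, not as its target.
                    total += os.lstat(os.path.join(dirpath, name)).st_size
            sizes[entry.name] = total
        else:
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


def report(root):
    """Print the children of root, largest first, in MiB."""
    for name, size in sorted(dir_sizes(root).items(), key=lambda kv: -kv[1]):
        print(f"{size / 2**20:8.1f}M  {name}")
```

Running `report` on a job's workspace directory lists entries such as the repository checkout and `target/` in descending order of size.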
I've started the process of making jobs depend on the binary artifacts of other jobs, which will probably remove the package-cache disk usage. So SqueakTrunk will produce a TrunkImage.image that ReleaseSqueakTrunk will take and produce a Squeak4.5.image, while ExternalPackages-Xtreams will turn the TrunkImage.image into a JUnit test result.
Saving 38M per job means saving 38*33 ~= 1G on disk.
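Spelling out that estimate with the figures quoted above (38M per job, 33 jobs), as a quick check:

```python
MB_PER_JOB = 38  # saving per job, as quoted above
JOBS = 33        # number of jobs used in the estimate above

total_mb = MB_PER_JOB * JOBS
print(f"{total_mb} MB ~= {total_mb / 1024:.2f} GB")
```

The product is 1254 MB, which is a little over the 1G quoted.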
frank
box-admins@lists.squeakfoundation.org