Re: Questions re: LF Edge Shared Infrastructure Proposal

Trevor.Conn@...
 

Tracking the additional questions I asked; see below.


1.) How will LF mitigate build worker starvation -- which we already see just in our own project, never mind a shared environment?

2.) Will we maintain the same level of customizability for our build environment, for example the new auto-tagging Jenkins pipeline jobs?

3.) It would be nice to have the ability to cancel jobs. For example, a PR is created which kicks off verify jobs. A couple minutes later the dev resubmits a change to the same PR (like a missing rebase or something). We have to wait for those initial jobs to finish before the second jobs run.

    ** It sounded from the discussion like this was more related to the GitHub plugin, but it's still on my want list **

4.) What sort of dashboard will we have? Will we be able to see the overall shared stats in order to identify delays or quantify usage of the infra by project?


Trevor Conn
Technical Staff Engineer
Core Working Group Chair of EdgeX Foundry
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA


From: EdgeX-TSC-DevOps@... <EdgeX-TSC-DevOps@...> on behalf of Gregg, James R <james.r.gregg@...>
Sent: Monday, August 5, 2019 6:57 PM
To: edgex-tsc-devops@...
Subject: [Edgex-tsc-devops] Questions re: LF Edge Shared Infrastructure Proposal
 


Andy Grimberg made an initial proposal in the DevOps WG meeting last week, introducing an opportunity to use a shared infrastructure for all projects under the LF Edge umbrella.  Based on that initial discussion, I have the following list of questions, which should help clarify the project in terms of scope, schedule, and budget.

 

Here is the additional “sandblast” of questions we did not get a chance to address in that meeting.

 

Scope

===========

Scope of the work

               - What work will the LF own within the proposed scope of work to transition to a common shared infrastructure?

               - What help is needed from the DevOps Open Source community?

Are there any other LF projects sharing a common build infrastructure?

 

What will the new GitHub repo structure look like?

What is the LF Release Engineer's role going to be in terms of ensuring that name collisions do not occur?

If it’s a shared infrastructure, do we have an opportunity to leverage common base build images for builds that produce Docker images?

 

It seems like one of the reasons for this proposal was to address a desire to reduce the size of duplicated infrastructure and thereby reduce technical debt when it comes to maintenance and ongoing support of the infrastructure.  Is there a plan to share a common DevOps WG under the LF Edge project so that communications are centralized?

 

Does the Linux Foundation have a plan to release roadmaps for when technical debt will be addressed? 

 

What other alternatives has the Linux Foundation considered to address the problem?

Are there any plans to look at leveraging Kubernetes for hosting build automation and leveraging more of a Container as a Service build automation model?

Has the Linux Foundation looked at Rancher Labs Rancher 2.0 for hosting a more modern CaaS platform?

What changes will be necessary to support all of the packer build images?

 

What optimizations of the shared common infrastructure will improve the overall build automation performance?

               - We have noticed service degradation when pulling images from upstream repos (Docker Hub or other repos)

               - We have seen what appears to be network degradation at times when pulling build dependencies

               - What other technical debt would be addressed within the proposed scope of work?

               - How will ARM builds be optimized in the new proposed shared / common infrastructure?

                              Note: ARM builds are slower than non-ARM builds

                              If there are more ARM builds happening (due to the shared infrastructure), how will we keep all builds from taking longer to complete?

Schedule

===========

What's the timing for the proposal?

               - Need proposed start and end dates that do not conflict with release dates or current development

Are the resources committed to do the actual work within the timeframes?

How will the work be coordinated so as to not disrupt current development?

 

 

Budget

===========

What data supports the proposal for a shared common infrastructure across all LF Edge projects?

Has the cost analysis been completed?

               - Please share the cost analysis of the proposed savings

               - number of builds for all projects (considered small if under 500 total builds) - What happens if the next project to join LF Edge pushes the number of builds above 1000?

               - shared resource model = shared support resources between LF Edge umbrella projects

Does the shared common build infrastructure also mean that the LF head count to support that build infrastructure is shared across all of the same LF Edge projects?

What's the actual savings?

 

 

Thank you, Andy, for coming to the DevOps WG meeting last week.  Hopefully we can get answers and further clarification so we can make some decisions and plan accordingly.

 

I appreciate your responses in advance; please let me know if any of my questions are unclear.

 

James Gregg

IOTG RBHE DevOps | EdgeX Foundry DevOps

Email: james.r.gregg@...

Tel: (480) 552-7965

 
