So far we have several ideas on what to do with distros. They have been discussed in several PRs and issues, so as agreed in the latest SIG, they will be all gathered here.
Right now we automatically select the first distro that is found in the respective entry point list. This is a problem because it is not possible to know which distro will be selected.
A. Force the user to have only one distro installed
B. Define which distro will be used by using environment variables
C. Define a priority system
Replies: 4 comments 7 replies
-
I am in favor of B, here is why.
These are the problems I see with A:
- It makes it impossible for a user to reuse code from another distro.
- It also makes it hard to test with one distro then with another one because that would mean uninstalling a distro, then installing another.
- It actually does not stop the user from installing another distro. This is bad because the error will not be raised when more than one distro is installed, but when opentelemetry-auto-instrumentation is run, which is too late.
This is the problem I see with C:
- Let's say there are n distros installed, D1, D2, ..., Dn, each with increasing priorities. If the user wants to select Di, then the user has to uninstall n - i distros.
Now, I think the common argument against B is that it requires the user to set an environment variable to select the distro when more than one distro is installed, while the intention is that distros require no further user interaction. I think having to use some mechanism to select the distro when more than one is available is inevitable. As such, we should just use the standard way of doing so: setting an environment variable.
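To make option B concrete, here is a minimal sketch of environment-variable selection. It is not the project's implementation: the variable name `EXAMPLE_SELECTED_DISTRO` and the function `select_distro` are hypothetical, and raising an error when several distros are installed but the variable is unset is my assumption, since that case is still under discussion.

```python
import os

# Hypothetical variable name, chosen only for this sketch.
DISTRO_ENV_VAR = "EXAMPLE_SELECTED_DISTRO"


def select_distro(installed, environ=None):
    """Return the name of the distro to load, or None if none is installed.

    installed: names of the distros discovered at startup.
    environ: mapping to read the selection from (defaults to os.environ).
    """
    environ = os.environ if environ is None else environ
    names = list(installed)
    if not names:
        return None

    requested = environ.get(DISTRO_ENV_VAR)
    if requested is None:
        if len(names) == 1:
            # Nothing to choose between, so no user input is needed.
            return names[0]
        # Assumption: ambiguity is reported instead of guessing.
        raise RuntimeError(
            f"Multiple distros installed ({', '.join(sorted(names))}); "
            f"set {DISTRO_ENV_VAR} to pick one."
        )

    if requested not in names:
        raise RuntimeError(f"Distro {requested!r} is not installed.")
    return requested
```

With a single distro installed this behaves exactly as today; the variable only matters once a second distro is present.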
-
It also makes it hard to test with one distro then with another one because that would mean uninstalling a distro, then installing another.
I still don't see why a user would want to do this and, in the unlikely case they did, I don't see why uninstalling one distro and installing another is that much of a problem. It sounds similar to someone trying to figure out whether to use flask or pyramid. I think it's fine to expect users to uninstall one or the other.
It actually does not stop the user from installing another distro. This is bad because the error will not be raised when more than one distro is installed, but when opentelemetry-auto-instrumentation is run, which is too late.
I don't think it's too late at all. It's not like anyone will install multiple distros and ship something straight to production. Even if someone did ship directly to production, it'd error out and tell them exactly what is wrong and what needs to be fixed. Even if we supported selecting one with an environment variable, projects like these would still error out or, worse, behave erratically in production unless the env var was set up.
-
Let's say there are n distros installed, D1, D2, ..., Dn, each with increasing priorities. If the user wants to select Di, then the user has to uninstall n - i distros.
Not sure in what scenario a user would want to "select" a specific distro. Users would just install the single distro they intend to use and it would work out of the box. The idea is that if a distro wants to use another distro as a library, it needs to bump its priority number to something higher than that of the distro it uses as a library.
For example,
- Let's say the default distro has priority 10
- If distro-A subclasses the default distro, then distro-A would set its own priority to 20.
- Now if distro-B wanted to subclass distro-A, it'd set its own priority to 30.
Now if an end-user installed the default distro, it'd obviously "just work". If they instead installed "distro-A", it'd pull in the default distro as a dependency, but distro-A would still be the one used because of the implicit priority it ships with. If the user instead installed distro-B, it'd pull in the default distro and distro-A as dependencies, but distro-B would still be the one that ends up being used. In all of the cases above, the system works exactly as the user intended, i.e., the distro that gets used is exactly the one the user installed with pip install or added to their dependencies.
This means that if a user is following the Lightstep getting started guide, they'd always end up using the Lightstep distro even if the Lightstep distro internally depends on 10 different distros it re-uses code from.
This solution does allow developers to re-use distros as libraries to build other distros while preventing any of the complexity from leaking into the userspace. Users don't need to learn anything new or configure anything extra. They just run pip install some-distro and start their service with opentelemetry-instrument, and everything works exactly as expected.
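The priority scheme described above can be sketched roughly as follows. This is only an illustration: the class names, the `priority` attribute and `select_by_priority` are hypothetical and do not come from any existing package.

```python
class DefaultDistro:
    # Priority shipped by the distro author; never set by end users.
    priority = 10


class DistroA(DefaultDistro):
    # Re-uses DefaultDistro as a library, so it declares a higher priority.
    priority = 20


class DistroB(DistroA):
    # Builds on DistroA, so it goes one step higher again.
    priority = 30


def select_by_priority(distros):
    """Return the installed distro class with the highest priority."""
    distros = list(distros)
    if not distros:
        return None
    # Note: max() returns the first maximum, so a tie between two distros
    # would still depend on discovery order.
    return max(distros, key=lambda distro: distro.priority)
```

Installing distro-B pulls in all three classes, and `select_by_priority([DefaultDistro, DistroA, DistroB])` picks `DistroB`; installing only distro-A leaves `DistroA` as the highest.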
-
Not sure in what scenario a user would want to "select" a specific distro.
When testing distros.
They just run pip install some-distro and start their service with opentelemetry-instrument, and everything works exactly as expected.
Not necessarily. With this approach the user first needs to check every installed distro priority to make sure that the priority of their selected distro is the highest, which leads the user to another problem: what if it is not? How do you raise the priority of a distro you want to use if you are not the distro developer?
-
Not necessarily. With this approach the user first needs to check every installed distro priority to make sure that the priority of their selected distro is the highest, which leads the user to another problem: what if it is not? How do you raise the priority of a distro you want to use if you are not the distro developer?
No, users do not need to do anything. They don't even need to know that multiple distros are installed. Maybe I was not clear enough in my original description. This is a solution to let distro authors re-use code from other distros and still guarantee deterministic behaviour when someone installs their (and only their) distro. It does not aim to enable users to install multiple distros at once and then pick one. The idea is that the priority/weight is not a user-facing concept; it'd be something only exposed to and used by distro authors as a way to provide hints to the instrument command.
Imagine another solution for a second. The instrument command loads all installed distros, inspects the inheritance tree between them and always selects the leaf node/distro. It'd be a way to figure out which distro is being used as a library by other distros and which distro the user actually intended to install.
Priority as metadata guarantees this without all the runtime magic and with extremely strong guarantees. It still does not allow users to manually install multiple distros and test them one by one without uninstalling but that is not the problem we are facing here.
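For comparison, the "inspect the inheritance tree and pick the leaf" alternative imagined above could look something like this. It is a sketch only; the classes and `select_leaf` are made-up names, and erroring out when there is not exactly one leaf is my assumption.

```python
class DefaultDistro:
    """Stand-in for a base distro (hypothetical)."""


class DistroA(DefaultDistro):
    """Re-uses DefaultDistro as a library."""


class DistroB(DistroA):
    """Re-uses DistroA as a library."""


def select_leaf(distros):
    """Return the one distro class that no other installed distro subclasses."""
    distros = list(distros)
    leaves = [
        candidate
        for candidate in distros
        if not any(
            other is not candidate and issubclass(other, candidate)
            for other in distros
        )
    ]
    if len(leaves) != 1:
        # Zero leaves means nothing is installed; several leaves means
        # unrelated distros are installed side by side.
        names = ", ".join(sorted(cls.__name__ for cls in leaves))
        raise RuntimeError(f"Cannot pick a single distro; candidates: [{names}]")
    return leaves[0]
```

This needs every distro to be imported before a choice can be made, which is the runtime magic that a priority value in the metadata avoids.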
-
Solution B is proposed in #1966.
-
I still think we are mixing up two different problems here.
Users installing multiple distros
Is there an actual use case that requires end-users to install multiple distros? If so, we should open a feature request for that and discuss if and how we want to enable such a use case. I can't think of any reason why someone would want to install multiple distros but intend to use only one.
So far we do not support having multiple distros installed. If users do it, they'll observe undefined behaviour. We should improve this by raising an exception when more than one distro is found, so users don't get tripped up by silent behaviour changes and instead get a deterministic outcome (failure).
In a nutshell, this is not supported today and I don't see any real-world scenario where we'd need to support this. We should explicitly call this out in docs and make the instrument command fail-fast in such cases.
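A fail-fast check along these lines is what I have in mind. It is a rough sketch; `require_single_distro` is a hypothetical helper, not an existing function of the instrument command.

```python
def require_single_distro(installed):
    """Return the only installed distro name, or None if there is none.

    Raises RuntimeError when more than one distro is found, so the
    outcome never depends on discovery order.
    """
    names = sorted(installed)
    if len(names) > 1:
        raise RuntimeError(
            "Only one distro may be installed, found: " + ", ".join(names)
        )
    return names[0] if names else None
```

The error message lists every distro that was found, so the user knows exactly what to uninstall.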
Sharing common code between distros
This is probably worth solving but not at the cost of complicating the end-user experience. DRY is good but it has to stop if following DRY results in end-users having to learn, configure and deal with unnecessary stuff.
I personally would just copy all this code to every distro and let it live there. It's not like a complex algorithm that needs to work everything exactly the same way. If a distro author copies code to configure a tracer provider, processor and exporter, I doubt it'll need changes/fixes if something changes in the default distro. If things work in Vendor A's distro, they have no real need to keep updating the distro code to match the default distro. It's just glue/config code mostly and does not benefit too much from DRY.
If we still wanted to follow DRY for distros, the solution is actually pretty simple. Just move the code as reusable functions to a common package that all distros can import and use. This solves the problem completely in practice. As a result distros by convention would only host very vendor-specific code and it wouldn't even make sense to subclass them.
However, if we really wanted to make distros extensible in order to build other distros, I think something like solution C would enable all such use cases without letting the problem "bleed" into the userspace and force the user to learn yet another concept.
-
I still think we are mixing up two different problems here.
These two different problems are tightly coupled together depending on your approach. If the approach for sharing distro code is the standard way of installing a package (that you can create yourself) and importing from it, then it becomes necessary that multiple distros can be installed.
Users installing multiple distros
Is there an actual use case that requires end-users to install multiple distros? If so, we should open a feature request for that and discuss if and how we want to enable such a use case. I can't think of any reason why someone would want to install multiple distros but intend to use only one.
I can, testing distros. Also, I can imagine this situation: a user wants to use all the configuration that distro A makes with one exception. So, the standard solution would be to install distro A, create distro B that inherits from A, call super to run the configuration from A, then reconfigure again the value they want that differs from the configuration that A provides.
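The situation I am describing would look roughly like this. It is an illustrative sketch: the classes, the `configure` method and the settings keys are all hypothetical.

```python
class DistroA:
    """Stand-in for an installed vendor distro (hypothetical)."""

    def configure(self, settings):
        settings["exporter"] = "vendor-a"
        settings["sampler"] = "always_on"
        return settings


class DistroB(DistroA):
    """User-written distro: everything from A, with one exception."""

    def configure(self, settings):
        # Run all of A's configuration first...
        settings = super().configure(settings)
        # ...then override the single value that should differ.
        settings["sampler"] = "ratio"
        return settings
```

For this to work, both A and B have to be installed at the same time, and B has to be the one that gets used.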
Sharing common code between distros
This is probably worth solving but not at the cost of complicating the end-user experience. DRY is good but it has to stop if following DRY results in end-users having to learn, configure and deal with unnecessary stuff.
I personally would just copy all this code to every distro and let it live there. It's not like a complex algorithm that needs to work everything exactly the same way. If a distro author copies code to configure a tracer provider, processor and exporter, I doubt it'll need changes/fixes if something changes in the default distro. If things work in Vendor A's distro, they have no real need to keep updating the distro code to match the default distro. It's just glue/config code mostly and does not benefit too much from DRY.
If we still wanted to follow DRY for distros, the solution is actually pretty simple. Just move the code as reusable functions to a common package that all distros can import and use. This solves the problem completely in practice.
It doesn't. It creates another problem. Let's say we put reusable code in a certain package. What if the user needs to use that certain code but with a slight change? The user can't change that code, only we can.
-
I can, testing distros. Also, I can imagine this situation: a user wants to use all the configurations that distro A makes with one exception. So, the standard solution would be to install distro A, create distro B that inherits from A, call super to run the configuration from A, then reconfigure again the value they want that differs from the configuration that A provides.
I think our recommendation should be to install only one distro, test it, uninstall it, and then install another one. We've received zero complaints/feedback about this. I don't think this is a real-world scenario TBH. Taking on so much complexity just to avoid pip uninstall in development isn't worth it IMO.
Distros are highly tied to vendors. I'm having a hard time imagining people testing different distros before deciding which APM vendor they want to use. It's usually going to be the other way round. While testing different vendors, running pip install/uninstall is going to be a very insignificant part of the process. People will usually need to install a distro and additional packages, configure their services, deploy to some environment and then spend a lot of time evaluating the APM product. Eliminating one pip uninstall is not going to make a difference here. Even if it did, I think we should optimize for real-world production use cases, not test/development, especially if solving something for testing potentially makes production cases more complicated.
-
It doesn't. It creates another problem. Let's say we put reusable code in a certain package. What if the user needs to use that certain code but with a slight change? The user can't change that code, only we can.
Sorry, not sure I follow. Wouldn't this be the exact same thing as any other library or distro if we made it possible to allow installing multiple ones at the same time?
-
I think @owais's argument for using priorities to determine which distro is to be used is motivated by the intention of not requiring the user to make any further changes (like setting an environment variable) after the desired distro is installed. Please correct me if I am wrong in my understanding, @owais. I also understand that the priority system suggested above by @owais attempts to allow the reuse of distro code by other distros.
I fundamentally agree with the intention of not requiring further user input after a distro is installed. I also fundamentally agree with the reuse of distro code by other distros. We have been working towards a solution that makes it easy for our users to use OpenTelemetry Python, with the goal being a single action that provides a code-ready environment for instrumentation.
What I fundamentally disagree with is making this or any other solution a component that is coupled with spec-defined components. In my opinion, making the experience easy for our users is a goal in itself, separate from spec compliance, and must be completely decoupled from the SDK, API, instrumentation, etc.
There are many ways in which we can make the experience of using OpenTelemetry Python easier, I can even imagine something like this:
pip install opentelemetry-quickstart
a hypothetical package that installs an opentelemetry-quickstart command that does everything a user needs to start using our project, by using distros or any other mechanism.
Distros are just a tool to make things easier for our users, one of many different tools that we can create for the same purpose.
A major part of the discussion above revolves around the actual implementation of distros. I think we need to take a step back first and make a decision regarding if and how we will make the separation between convenience tools like distros and the spec-defined components. Here is my position regarding the if and the how:
- If. Yes, we must keep convenience tools separated and decoupled from the spec-defined components.
- How. I suggest we use the standard mechanism, entry points.
The entry point approach linked above provides the user with hooks for doing anything before and after instrumentation happens. In this way distros become just another tool that uses one of the hooks we provide alongside the rest of the spec-defined project, for our users' convenience, and they can be used or not by the user.
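As a rough illustration of the hook idea, and not the actual proposal, the instrument command would only know about generic callables that run before and after instrumentation. In practice these would be discovered through entry points; here they are passed in directly to keep the sketch self-contained, and `run_instrumentation` is a hypothetical name.

```python
def run_instrumentation(instrument, pre_hooks=(), post_hooks=()):
    """Run hooks before and after the instrumentation step.

    The runner knows nothing about distros; a distro is just one
    possible pre-instrumentation hook.
    """
    for hook in pre_hooks:
        hook()
    instrument()
    for hook in post_hooks:
        hook()


events = []
run_instrumentation(
    instrument=lambda: events.append("instrument"),
    pre_hooks=[lambda: events.append("distro configure")],
    post_hooks=[lambda: events.append("post step")],
)
print(events)  # ['distro configure', 'instrument', 'post step']
```

Nothing in the runner depends on what a hook does, which is the decoupling I am arguing for.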
I care much more about the separation of convenience tools from the spec-defined components than about the actual implementation of such tools.