
Opaque Types in Python

80 points - last Saturday at 1:17 PM

  • gorgoiler

    today at 5:06 PM

    An alternative to consider might be to accept a Literal["fast", "slow"] or an Enum FAST or SLOW, and then decode that into shipping options inside the shipping code.

    Only then are you truly putting a solid boundary between your library and the folks using your library. Everything else is just praying that you and only you have an underscore on your keyboard! :)
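
    A minimal sketch of that idea, assuming a hypothetical `Speed` enum and `ship()` function (neither comes from the article): callers only ever see the enum, and the mapping to internal shipping details stays inside the library.

```python
from enum import Enum


class Speed(Enum):
    """Hypothetical public enum; the only thing callers need to import."""
    FAST = "fast"
    SLOW = "slow"


def ship(package: str, speed: Speed) -> str:
    """Decode the public enum into internal shipping details."""
    # Internal lookup table: free to change without touching the public API.
    internal_options = {
        Speed.FAST: ("air", 1),
        Speed.SLOW: ("ground", 5),
    }
    conveyance, days = internal_options[speed]
    return f"{package} ships by {conveyance} in about {days} day(s)"


print(ship("book", Speed.FAST))
```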

    And of course another alternative is to accept that there is no true private in Python other than defdef*, so you allow your ShippingOption to be publicly visible while also documenting that the helper-constructors are what should really be used.

    *"defdef" as in function definitions inside other function definitions (closures, if you will), although I prefer to write mine as taking most if not all their parameters explicitly:

      def public(foo):
          def private(foo):
              ...

          class Private:  # less common
              ...

          ...

    • nayuki

      today at 4:34 PM

      Java made opaque types possible from the very start by private and package-private constructors.

      It's sad to see that many features regarding object-oriented programming and static typing are implemented worse in Python than Java. Various examples: __str__() vs. toString(); underscore vs. private; @staticmethod/@classmethod vs. static; generic types are so clunky in Python; types are not shown in the official Python standard library documentation; __init__() doesn't force you to call super() whereas it's mandatory in Java; @override (Python 3.12; year 2023) copying Java @Override (JDK 1.5; year 2004) very late; convention changing from duck typing (always available in Python) to structural typing (optional in Python, mandatory in Java).

        • zephyrthenoble

          today at 4:47 PM

          I agree that these are implemented "worse" than Java but that's because Python wants these things to be optional! You can complain about it as much as you want but that just means it's the wrong language for you, if that's important to you.

            • nayuki

              today at 5:20 PM

              (I'd like to note that I wrote a lot of code in Java and Python, and continue to use each language in its respective strong areas. This isn't meant as a drive-by attack of "Java r00lz, Python sux"; this is an experienced take.)

              My real problem with the evolution of Python is that initially, the language and the community were positioned as anti-Java, anti-big-OOP-like-C++, and then it changed into the thing that it was against, but in a roundabout and suboptimal way. To me, the initial vibe of Python was, "write a 100-line script, don't worry about explicitly documenting types, don't worry about grand architecture, don't worry about creating custom classes, don't worry about encapsulation and public/private". I've been with Python since year 2007 in the 2.x days, and Java since 2002.

              Initial examples: Why go through the ceremony of `public static void main(String[] args)` when Python just executes the script line by line at the top level? Oh wait, now you have things like `import` actually executing code instead of simply being a compile-time namespace convenience, and you need weird techniques like `if __name__ == "__main__"`. Why `System.out.println()` when `print()` is so much more concise? But now you're polluting the global namespace, and `print(file=sys.stderr)` isn't that elegant either.

              Static typing in Python is the biggest hypocrisy ever. As I understood it, Python scripts were meant to be lightweight and free of the tyranny of enterprise OOP which was epitomized by Java. But people found out that keeping track of types in your head is laborious and error-prone, and getting a compiler to check {that the shape of your objects and function calls match} is a huge productivity boost. And so Python 3 enabled static type hints... which, like I said before, Java had from day zero. To make matters worse, static type hint features were introduced progressively over the years, leading to things getting deprecated from the `typing` module and moved to things like `T|None` and `list[T]` and `collections.abc`.

              IIRC the old practice in Python was that you specified some kind of interface in prose or in code (e.g. `class IoStream: def read(); def close()`), but you didn't need to explicitly use that interface as a superclass; you can just duck-type your way around things. But this completely goes against static typing, so I'm pretty sure the new preferred way is to explicitly use abstract superclasses... just like Java did all along (and is mandatory).
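
              For illustration, a sketch of the explicit-superclass style using the standard `abc` module. `IoStream` follows the name used above; `MemoryStream` and `read_all` are made-up names for this example.

```python
from abc import ABC, abstractmethod


class IoStream(ABC):
    """Explicit interface: subclasses must implement read() and close()."""

    @abstractmethod
    def read(self) -> str: ...

    @abstractmethod
    def close(self) -> None: ...


class MemoryStream(IoStream):
    """Hypothetical implementation that opts in by subclassing IoStream."""

    def __init__(self, data: str) -> None:
        self._data = data
        self.closed = False

    def read(self) -> str:
        return self._data

    def close(self) -> None:
        self.closed = True


def read_all(stream: IoStream) -> str:
    """Read the stream's contents and always close it afterwards."""
    try:
        return stream.read()
    finally:
        stream.close()


print(read_all(MemoryStream("hello")))
```

              Instantiating `IoStream` directly, or a subclass that omits one of the abstract methods, raises `TypeError` at runtime.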

              I really don't think having top-level (module) variables and functions in Python is a good thing, especially because then they are duplicated as fields and methods in classes. In Java, fields and methods (whether static or instance) can only be placed in classes, and I think this particular straitjacket is a good thing.

              > because Python wants these things to be optional

              We can both agree that Python gives multiple ways to do things (e.g. no static type hints vs. static type hints). This flies in the face of:

              > Readability counts.

              > The Zen of Python / There should be one-- and preferably only one --obvious way to do it. -- https://peps.python.org/pep-0020/

              Probably the most tragic example is the ways to build up strings in Python: `+` and str(), `%` operator, `str.format()`, f-string.
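
              For reference, the four styles side by side, all producing the same string (the variable names are illustrative):

```python
name = "Ada"
count = 3

# 1. Concatenation with + and str()
s1 = "Hello, " + name + "! You have " + str(count) + " messages."

# 2. printf-style % operator
s2 = "Hello, %s! You have %d messages." % (name, count)

# 3. str.format()
s3 = "Hello, {}! You have {} messages.".format(name, count)

# 4. f-string (Python 3.6+)
s4 = f"Hello, {name}! You have {count} messages."

assert s1 == s2 == s3 == s4
print(s4)
```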

              (To be fair, I have a laundry list of complaints about Java too, such as: .class files and the JVM being an intermediate layer that needs to be understood which is actually different from the Java source language, lack of in-place structs so `new Point[]` is very painful on the memory system, awkward string interpolation/formatting compared to Python's f-strings, very awkward JDBC compared to for example Python sqlite3 API, kinda clunky for web server programming, very awkward JSON handling, enterprisey libraries and APIs that are perfectly documented but are impossible to actually understand.)

                • anon7725

                  today at 5:42 PM

                  > so I'm pretty sure the new preferred way is to explicitly use abstract superclasses... just like Java did all along (and is mandatory).

                  typing.Protocol is a good fit for this use case

                    from typing import Protocol
                    
                    class HasMessage(Protocol):
                        def get_message(self) -> str: ...
                    
                    class A:
                        """Implicit (duck-typed)"""
                        def get_message(self) -> str:
                            return "A"
                    
                    class B(HasMessage):
                        """Explicit"""
                        def get_message(self) -> str:
                            return "B"
                    
                    class C:
                        def get_message(self) -> int:
                            return 1
                    
                    def print_message(m: HasMessage) -> None:
                        print(m.get_message())
                    
                    print_message(A())
                    print_message(B())
                    print_message(C())  # fails type check

                  • david422

                    today at 5:42 PM

                    > Static typing in Python is the biggest hypocrisy ever

                    Yes, agreed. I used to work on a large python codebase and tried to add type hints where I could. The issue is that python was not the right tool for the job - except that switching to the right tool was a non-starter. So type hints were the best I could do.

            • causal

              today at 5:45 PM

              Python is much older than Java, and Java is a big OO-first language. It's a bit like saying Python doesn't do functional as well as Erlang.

          • jnwatson

            today at 3:07 PM

            You're holding it (Python) wrong. Python OO was a counter reaction to the bondage and discipline that languages like C++ had with private members and protected inheritance.

            If you have members that users probably shouldn't touch, you prepend them with an underscore. This is just a hint; it doesn't actually change anything. We're all adults here and we know the consequences of reaching into implementation details.
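
            A small sketch of what "just a hint" means in practice. The `Account` class is hypothetical:

```python
class Account:
    """Hypothetical class with one public and one 'private' attribute."""

    def __init__(self, owner: str, balance: int) -> None:
        self.owner = owner        # public
        self._balance = balance   # leading underscore: internal by convention

    def deposit(self, amount: int) -> None:
        self._balance += amount

    @property
    def balance(self) -> int:
        return self._balance


acct = Account("ada", 100)
acct.deposit(50)
print(acct.balance)  # supported, public access

# Nothing stops a caller from reaching in; the interpreter does not enforce it.
acct._balance = -1
print(acct.balance)
```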

              • ddavis

                today at 3:16 PM

                I agreed with this 100% for a long time. Then I started working on a library at $WORK with dozens of downstream users abusing the hell out of my idiomatic underscore usage, especially in the context of lazy tests with folks writing endless mocks. When I’d “break” their test suite (blocking some time-sensitive release) I’d get all kinds of shit. But _they_ were breaking the contract. Unfortunately I had little (if any) control over the path of application code making it to production (yeah yeah not great engineering org, but it’s the world I lived in). Strategies like the one in this post would be helpful for said situations.

                  • senkora

                    today at 4:17 PM

                    There’s always the extra idiomatic __SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED for coworkers that can’t take a hint.

                    https://github.com/reactjs/react.dev/issues/3896

                    • jghn

                      today at 3:59 PM

                      Something similar happened to me. I told those groups to pound sand because they knew they were relying on something which they should not. Manager had my back, they whined a lot but they had to change and improve their processes.

                  • sdeframond

                    today at 3:18 PM

                    > We're all adults here and we know the consequences of reaching into implementation details.

                    I wish you were right but, IMHE, it requires a lot of communication once teams grow and many team members do not fully understand the consequences of what they do. It is nice to have something that helps when reviewing code.

                    > If you have members that users probably shouldn't touch, you prepend them with an underscore

                    Well, this is precisely what TFA does. It prepends the constructor with an underscore.

                    • masklinn

                      today at 3:28 PM

                      I think you missed the issue at hand:

                      > even if you keep all your fields private, the constructor is still, inherently, public.

                      ShippingOptions and the literals / enums are part of the public API, so the user would just be writing

                          ShippingOptions(Carrier.USPS, Conveyance.Air)
                      
                      with no hint that they're doing anything wrong.

                      Dataclasses do have a `kw_only` option, but I'm not sure how well underscore prefixes would be understood as private parameters / a private ctor, whereas wrapping a clearly "private" type should be clear to everybody.

                      Glyph is not entirely correct on the "any class" bit as you can always break the default init path:

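                        from typing import Literal


                        class ShippingOptions:
                            _ship: Literal["fast", "normal", "slow"]
                            __init__ = None  # calling ShippingOptions() now raises TypeError


                        def shipFast() -> ShippingOptions:
                            # bypass __init__ by allocating the instance directly
                            opts = object.__new__(ShippingOptions)
                            opts._ship = "fast"
                            return opts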
                      
                      however that's a pretty ugly pattern, and unlike the one they propose I doubt tooling would understand it.

                      • 5691827

                        today at 3:31 PM

                        "Glyph" knows. He has been in the Python inner circle for decades back to when the circle promoted "spam and eggs" and "consenting adults".

                        Like the rest of that circle, he moves with the times, supports public shaming of Tim Peters and others and now promotes poorly implemented information hiding so Python ticks a few more boxes for the industry.

                        Information hiding in a language that allows changing the values of small integers at runtime via ctypes is doomed anyway. And there are plenty of better languages that do it out of the box and in a straightforward manner.

                    • sdeframond

                      today at 3:30 PM

                      Funny, I ran into the same pattern just a few months ago!

                      In practice, I found it difficult for coworkers to read and understand so I dropped the idea.

                      Another limitation I found is that it breaks down when you start using inheritance. For example:

                      ```
                      from typing import NewType

                      class _A: pass
                      A = NewType("A", _A)

                      class _B(_A): pass
                      B = NewType("B", _B)

                      def foo(a: A) -> None: pass

                      b = B(_B())
                      foo(b)     # Mypy is not happy: Argument 1 to "foo" has incompatible type "B"; expected "A"
                      foo(A(b))  # Mypy is OK
                      ```

                        • whilenot-dev

                          today at 5:28 PM

                          Just use a generic and make it bound to (A, B):

                              from typing import NewType, reveal_type
                              
                              
                              class _A:
                                  pass
                              
                              class _B(_A):
                                  pass
                              
                              A = NewType("A", _A)
                              B = NewType("B", _B)
                              
                              def foo[T: (A, B)](val: T) -> T:
                                  return val
                              
                              a = A(_A())
                              b = B(_B())
                              
                              _a = foo(a)
                              _b = foo(b)
                              
                              reveal_type(_a)
                              reveal_type(_b)
                          
                          Playground here: https://mypy-play.net/?mypy=latest&python=3.12&gist=36573363...

                          • simonw

                            today at 4:53 PM

                            (On Hacker News you can do code blocks by indenting each line with two spaces.)

                              • sdeframond

                                today at 5:09 PM

                                Aaah nice! Thank you!

                        • CreRecombinase

                          today at 4:55 PM

                          Why not just use a dictionary, or why not just leave the type unannotated? If you really can't (or don't want to) say anything about the type, then don't. Python is dynamically typed!

                            • MeetingsBrowser

                              today at 5:08 PM

                              The blog post does want to share some type information with users. They just want to prevent users from relying on a specific implementation of that type.

                              They are basically describing a public API backed by a private type that they can extend, rearrange, or otherwise modify without breaking the public contract.

                              • sdeframond

                                today at 5:06 PM

                                The point is to mark the constructor as "private" so that it is easy to spot unintended use during code reviews (or using linters).

                                • techn00

                                  today at 5:06 PM

                                  average python script writer

                              • corwinxpro

                                today at 2:57 PM

                                The main problem with such an approach is that `class _RealShipOpts:` is very ugly to write unit tests for. You need to import a private entity in tests. I would slightly change the presented approach, and move the "public" `ShippingOptions`, `shipFast`, etc., into a new module that is a public API, for my users to use something like `from my_lib.shipping.api import ShippingOptions`.

                                That way, I can use "normal" naming in `class RealShipOpts:...`, and be explicit that it's not really public for the end users (they should use the `.api` module instead).

                                • tcdent

                                  today at 3:56 PM

                                  I'm sorry but if you write Python functions/methods in camel case I can't take you seriously.