this post was submitted on 22 Aug 2023
184 points (98.4% liked)
But why? Excel is a shit way to work with big amounts of data due to its own format's complexity and bloated software. It's welcome to implement Python, but that's not what holds it back. Opening a big CSV would crash it on the same machine that loads it with a Python IDE in seconds. It's not made for this. It's, like, nice, but the volume of information you need to make it matter would break Excel in half.
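For anyone wondering why a script copes with a CSV that a spreadsheet struggles to open, here is a minimal sketch using only the standard library. It reads one row at a time, so memory use stays roughly flat regardless of file size. The function name `column_total`, the column names, and the sample data are made up for illustration.

```python
import csv
import io


def column_total(lines, column):
    """Sum one numeric column while reading the CSV one row at a time."""
    reader = csv.DictReader(lines)
    total = 0.0
    rows = 0
    for row in reader:
        # Only the current row is held in memory, never the whole file.
        total += float(row[column])
        rows += 1
    return rows, total


# Hypothetical in-memory data standing in for a large file on disk.
sample = io.StringIO("region,amount\nnorth,10.5\nsouth,4.5\nnorth,5.0\n")
rows, total = column_total(sample, "amount")
print(rows, total)
```

With a real file you would pass `open(path, newline="")` instead of the `StringIO` object; the loop itself does not change.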
This feels like a really dated take to me. Leaving aside whether this was true in the past, in 2023, Excel is happy to open absolutely gargantuan files, and it's quite speedy once it's done so. You can even directly tie it to a database via ODBC if you want, and that works (albeit it obviously flattens the data out in the process, so goodbye foreign keys in any real sense). It also has tons of very easy-to-use data manipulation tools (pivot tables, tables in general, data extrapolation, graphs, etc.) that end up being wonderful complements to something like Python.
Could you write a Python program that would run faster than pure Excel and do the same thing? I mean, probably (although Excel's core execution engine is honestly pretty freaking fast). But could you write it as quickly? Maybe, maybe not. And certainly someone who knows Excel well would have an easier time adding a little Python to patch up any issues than rewriting the whole thing from scratch.
tl;dr I think you're not being accurate about contemporary Excel, and I separately suspect you're not really the target audience here
Could I write a Python program that does the same thing as Excel but faster?
I don't need to. It's called pandas
I agree with all your points about Excel being capable. However, I'm struggling to think of examples where this newly announced Python integration within Excel would be helpful (with the exception of new/different visualizations) - especially for the reasons you stated about modern Excel.
Are there any use cases that you can think of where someone who knows Excel well would resort to "adding a little Python to patch up any issues"?
Maybe some sort of "team toolbox" of logical functions or something? I've seen some nightmare shops where big reports have a page of manipulations that get copied to new reports/projects every time, and each represents some sort of canned, core business logic.
I dunno. I can't imagine how the code storage will go.
The code storage for Python is no different than regular Excel functions (eg - VLOOKUP(), SUM(), etc.), meaning that it is stored within an Excel cell. The only differences are that Python code is run remotely vs Excel functions running locally and the location of Python's code matters vs Excel's functions are location agnostic (ie - Python code runs in cells located left-to-right, top-to-bottom but Excel's functions can dynamically determine the calculation order/location).
I'm not sure that this new Python integration changes much about this use case (except for another way to accomplish the same/similar tasks).
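To make the left-to-right, top-to-bottom point concrete, here is a small sketch that sorts A1-style cell addresses into that order, assuming the evaluation order within a sheet is row-major as described in this comment. `cell_key` and `row_major` are hypothetical helper names, not part of Excel or any library.

```python
import re


def cell_key(address):
    """Turn an A1-style address such as 'B12' into (row, column_index)."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", address)
    if match is None:
        raise ValueError(f"not an A1-style address: {address!r}")
    letters, digits = match.groups()
    column = 0
    for letter in letters:
        # Column letters are a base-26 number with A=1 ... Z=26.
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return int(digits), column


def row_major(addresses):
    """Order cells by row first, then by column within each row."""
    return sorted(addresses, key=cell_key)


order = row_major(["B1", "A2", "C1", "A1", "AA1"])
print(order)
```

The practical upshot is that a Python cell can only rely on values defined in cells that come earlier in this ordering, which is not a concern with ordinary formulas.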
I hear you.
I don't know how new Excel performs, and I thought it was the same as ten years ago, the version I'm trapped in. With people who obsessively try to drive it to the edge, where it's not responsive on average office PCs.
But if it works well with various big spreadsheets now, it's a wonder, with how many new people start to tackle programming with Python. I obviously won't write a script faster than the normal operational speed of the software; it's just that some tables ended up so big and broken that I could only open them like that. But that, I guess, is the exception?
It's just the issue of people using a microscope as a hammer when they need to break nuts.
So, assuming you're still on Office 2010, you're missing (off the top of my head, but I believe these were all Excel 2013 or later):
A2:A300 garbage where Excel would instead just have e.g. SomeTable[Heading]. E.g., an actual formula from a sheet I currently maintain to track my team's sprints: =XLOOKUP([@Verified],SprintMeta[Start],SprintMeta[Sprint Name],"Unknown",-1). Python's easier to read here, but this is honestly doing a lot while being surprisingly readable (especially if you're familiar with XLOOKUP, which is basically how you do keyed array access in Excel)
You have totally legitimate gripes about Excel; I'm not denying that. But I do think that you might be pleasantly surprised on newer versions.
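For comparison, here is a rough Python equivalent of the XLOOKUP formula quoted above. A match mode of -1 means "exact match, or else the next smaller item", so the lookup returns the sprint with the latest start date on or before the verified date. The table contents, sprint names, and the function name `sprint_for` are invented for illustration; the real SprintMeta table is not shown in the thread.

```python
from datetime import date

# Hypothetical stand-in for the SprintMeta table: (Start, Sprint Name).
SPRINT_META = [
    (date(2023, 8, 7), "Sprint 41"),
    (date(2023, 8, 21), "Sprint 42"),
    (date(2023, 9, 4), "Sprint 43"),
]


def sprint_for(verified, table=SPRINT_META, default="Unknown"):
    """Return the sprint whose start is the latest one on or before `verified`."""
    candidates = [(start, name) for start, name in table if start <= verified]
    if not candidates:
        # Mirrors the "Unknown" if-not-found argument of the formula.
        return default
    # The largest start date that does not exceed the lookup value wins.
    return max(candidates, key=lambda item: item[0])[1]


print(sprint_for(date(2023, 8, 25)))
```

Whether this is more readable than the one-line formula is a matter of taste, which is more or less the point being made.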
Sounds like this won't be the right tool for your use case
You are right, but my company will still use it, and I'll be left cringing at that with no way to change it.
I agree! I'm not sure why you're being downvoted either.
This new integration just allows you to do data analysis and data visualization of existing data within an Excel file via Python. The output of your Python scripts is limited to the Excel file. The Python environment itself is also limited as it runs on Microsoft's platform and is controlled by Microsoft.
The (Excel) problem that people already using Python for data analysis/visualization have is that they have to use Excel files. Reading/writing Excel files via Python can sometimes be tedious or limiting. Utilizing Python inside of Excel via this integration may help in some scenarios, but they won't be able to use custom libraries built internally, control the Python environment (eg - must use specific version of Python or Python library, can't utilize all Python libraries available on pip, etc.), connect to all necessary external data sources via Python, and utilize proper VCS tools like git.
The problem that people automating tasks via Python have is that there is no Python library nearly as capable of reading or manipulating Excel files as VBA is. This new Python integration does not change that.
The problem that Excel users have is that they want more advanced (or simple/easier) data analysis and data visualization capabilities. However, with Excel's dynamic array formulas, LAMBDA formula, Power Query, and Power Pivot, Excel is becoming much more capable than it ever was. If those tools cannot meet your needs, you likely need to move to something like R, Python, or some other tool. Embedding Python into Excel like this integration does still limits Python with all of Excel's current constraints (size, performance, etc.).
Because this may allow companies that are already using Excel and not planning on changing to actually have a powerful programming language.