In this age of exploding data volumes, many companies are working to consolidate their data into a common format. Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model, or programming language. Parquet is built from the ground up with complex nested data structures in […]
Pandas is a Python library used for reading and writing large tabular datasets, performing arithmetic operations on numeric data, and manipulating textual data. Pandas DataFrames are widely used in PyTorch environments. Pandas Installation: Pandas can be installed using Anaconda or a Python virtual environment. Use the following commands for each environment. For Anaconda: conda install pandas. For […]
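To make the excerpt's description a little more concrete, here is a minimal sketch of reading tabular data, doing arithmetic on numeric columns, and manipulating text with pandas. The column names and values are invented for illustration and are not from the original post. An in-memory string stands in for a CSV file so the snippet runs on its own.

```python
import io

import pandas as pd

# Illustrative data only; in practice this would usually be a file path.
csv_text = "name,qty,price\napple,3,0.5\npear,2,0.75\n"

# Read tabular data into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Arithmetic on numeric columns is applied row by row without a loop.
df["total"] = df["qty"] * df["price"]

# Text columns expose string methods through the .str accessor.
df["name"] = df["name"].str.upper()

# Write the result back out as CSV text (index column omitted).
out = df.to_csv(index=False)
print(out)
```

Passing a real path to `pd.read_csv` and `df.to_csv` works the same way for files on disk.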
NumPy is a package for scientific computing. It provides a library for multidimensional arrays and can perform many mathematical operations on large datasets. It is also helpful for sorting large datasets and performing I/O operations. The library can also be used to generate random data for simulations. The main NumPy object is the “ndarray”. This object can be single […]
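As a rough illustration of the operations this excerpt mentions (array math, sorting, I/O, and random data), here is a minimal sketch built around an ndarray. The array values, the seed, and the variable names are assumptions made up for the example. An in-memory buffer stands in for a file on disk.

```python
import io

import numpy as np

# A 2-D ndarray built from a nested list (illustrative values).
a = np.array([[3, 1, 2], [9, 7, 8]])
print(a.ndim, a.shape)  # number of dimensions and size along each axis

# Element-wise arithmetic, no explicit Python loop needed.
doubled = a * 2

# Sort each row independently.
row_sorted = np.sort(a, axis=1)

# I/O: save the array in NumPy's binary format and load it back.
buf = io.BytesIO()
np.save(buf, a)
buf.seek(0)
restored = np.load(buf)

# Random simulation data from a seeded generator, so runs are reproducible.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)
```

`np.save` and `np.load` also accept a file name in place of the buffer used here.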
An old colleague of mine reached out to me about creating a random string within CloudFormation. If you have not done it before, it can get tricky. I wish Amazon had created a function for this, but then how would I have showcased my love for SERVERLESS with this blog? I will be […]