Search - "image processing"
-
AI developers be like
-
/*
It's a pretty long rant. Hope you didn't get bored :P
*/
So I have this friend of mine who has learnt Python at a good level (that's what he says) and is with me in all my classes in college. I have only worked with C, C++, C# and Java, and hated Python when it was taught (wk44).
So the following happened in the last 2 weeks:
Once he wrote a Python function in the terminal that just returned a hard-coded string (lame, right?) and showed me how cool it was and that it is sooo much easier.
Whenever we do a mini project together he insists that we use Python. Even in image processing, when everyone is ready to work in Matlab, he insists that Python would be a better option.
We pointed out that this XYZ is very easy to implement in Matlab.
We then had to listen to a lecture about the large and great community of Python, that it has libraries for everything and that it is the greatest programming language ever.
One day he saw my C# project for DFA and NFA simulation, which was the greatest project I have "completed" myself, and went like "Hmph, if I were you, I would use Python and write more "professional" code" (then went on arguing as always)
This happened today in Networking lab:
(Sockets were taught and we are expected to learn the programming aspects)
All students: Open linuxhowtos.org and start reading on socket programming
He: Opens some websites and downloads books on Networking with Python or something
Now while I am reading the documentation of sockets and bind, he opens the Spyder IDE, copy-pastes the code from the book and starts bugging ME that he is getting all these errors, literally showing me those errors and whining about all those problems.
Me: We are supposed to learn this in C. Here, take a look at this link.
He: No, I'll use Python cuz it is better than your C. It has libraries for everything and is much easier.
Me: Alright, whatever, I am fed up, do whatever you want
-
TLDR;
Wrote a slick scheduling and communication system allowing me to assign photography resources based on time and location.
I'll tell you a little secret ... I'm not actually a dev. I'm a photographer, pretending to be a dev.
Or ... perhaps it's the other way around? (I spend most of my time writing code these days, but only for me - I write the software I use to run my business).
I own a photography studio - we specialize in youth volleyball photography (mostly 12-18 year old girls with a bit of high school, college and semi-pro thrown in for good measure - it's a hugely popular sport) and travel all over the US (and sometimes Europe) photographing.
As a point of scale, this year we photographed a tournament in Denver that featured 100 volleyball courts (in one room!), playing at the same time.
I'm based in California and fly a crew of part-time staff around to these events, but my father and I drive our booth equipment wherever it needs to go. We usually set up a 30'x90' booth with local servers, download/processing/cashier computers and 45 laptops for viewing/ordering photographs. Not to mention 16' drape and banners, tons of samples, 55" TVs, etc. It's quite the production.
We photograph by paid signup only - when there are upwards of 800 teams/9,600 athletes per weekend playing, and you only have four trained photographers, you've got to manage your resources!
This of course means you have to have a system for taking those sign-ups, assigning teams to photographers and doing so in the most efficient manner possible based on who is available when the team is playing. (You can waste an awful lot of time walking from one court to another in a large convention center - especially if you have to navigate through large crowds - not to mention exhausting yourself.)
So this year I finally added a feature I've wanted for quite some time - an interactive court map. I can take an image of the court layout from the tournament and create an HTML version in our software. As I mouse over requests in one window, the corresponding court is highlighted on the map in another browser window. Each photographer has a color associated with them. When I assign requests to a photographer, the court is color coded with the color of the photographer. This allows me to group assignments to minimize photographer walk time and keep them in a specific area. It's also very easy to look at the map and see unassigned requests and look to see what photographer is nearby.
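The "see what photographer is nearby" step could be sketched roughly like this. It is only an illustration: the coordinate layout, the photographer names and the `nearest_photographer` helper are all assumptions, not part of the software described above.

```python
import math

def nearest_photographer(court, photographers):
    """Return the name of the photographer closest to `court`.

    `court` is a hypothetical (x, y) position on the court map and
    `photographers` maps a name to that person's current (x, y) position.
    """
    return min(photographers, key=lambda name: math.dist(court, photographers[name]))

# Hypothetical positions, in arbitrary map units.
photographers = {"red": (2, 3), "blue": (40, 12)}
print(nearest_photographer((5, 4), photographers))
```

Straight-line distance is a simplification; a real convention floor has aisles and crowds, so walking time would need a better cost estimate.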
This year I also integrated with Twilio and set up a simple set of text shortcuts that photographers can use to let our booth staff know where they are, if they have memory cards that need picking up, if they need water/coffee/snack, etc. They can also move assignments on their schedule or send an SOS for help if it looks like they aren't going to be able to photograph a team.
Kind of a CLI via the phone. :)
The additions have turned out to be really useful and have made scheduling and managing the photographers much easier than it was in the past.
-
Finally got myself a Lytro Illum!
I've been wanting to buy one since it came out but the company that made it closed down in 2015..
Those fuckers just threw everything in the trash and set it on fire, software, firmware, mobile app etc.. no open source, no archives, your expensive camera is now a paper weight! You're welcome!
So I got myself a new hobby, started reverse-engineering the fuck out of it. Luckily it's based on Android (API 17), I have adb and it's running a hidden DHCP server too, so it's coming along nicely :D
I'm planning to make a camera control mobile app for it and maybe some faster image processing, wifi sharing etc..
I love being in home office :D
-
On the top of a mountain, while skiing, -6 °C, no gloves, on my phone.
I use a live wallpaper I made with Processing, and it uses GPS location and forecast data to change the background image according to the environment and climate. It sucks and drains battery like a bitch, and as soon as I got to the top of the mountain it fucked up everything, home screen froze and camera wouldn't open.
So guess what, it was debug time. Hands dead cold and APDE with no autocomplete on a smartphone keyboard. The agony.
My gf yelled at me and after 10 minutes I switched to a static wallpaper, uninstalled that one and never touched it again since then
-
So I'm looking to buy a drone for my internship company to find people during floods. And damn these companies suck balls.
Closed source.
You want to use API for onboard image processing?
Buy a €3500 drone
Add €1100 processor stuff
Add €850 camera
ugh.
-
'Sup mates.
First rant...
So Here's a story of how I severely messed up my mental health trying to fit in university.
But the bonus: Found my passion.
Here we go,
Went to university thinking it'll be awesome to learn new stuff.
1st sem was pure shock - Programming was taught at the speed of V2 rockets.
Everything was centred around marks.
Wanted to get a good run in 2nd sem, started to learn Vector design, but RIP- Hospitalized for Staph infection, missed the whole sem and was in recovery for 3 months.
So I asked uni for financial assistance as I had to re-register the courses the next semester. They flat out refused, even in a case this serious.
So, time to register courses for third semester, turns out most of the 2nd year courses are full, I had to take 3rd year courses like:
Social and Informational Networks
Human Computer Interaction
Image processing
And
Parallel and Distributed Computing (They had no prerequisites listed, for the cucks they are: BIG MISTAKE)
Turns out the first day of classes that I attend, the Image proc. teacher tells me that it's gonna be difficult for 2nd years so I drop it, as the PDC prof. also seconds that advice.
Time travel 2 months in: The PDC prof is a bitch, doesn't upload any notes at all and teaches like she's on Velocity-9 while treating this subject like a competition on who learns the most rather than helping everyone understand.
Doesn't let students talk to each other in lab even if one wants to clear their friend's doubt, "Do it on your own!" What the actual fuck?
Time for term end exams and project submission: Me and 3 seniors implement a Distributed File System in python and show it to her, she looks satisfied.
Project Results: Everyone else got 95/100
I got 76.
She's so prejudiced that she thinks that 2nd years must have been freeloaders while I put my ass on turbo for the whole sem, learning to code while tackling advanced concepts to the point that I hated to code.
I passed the course with a D grade.
People with zero consideration for others get absolutely zero respect from me.
Well it's safe to say that I went Nuclear(heh.. pun..) at this point, Mentally I was in such a bad place that I broke down.... Went into depression but didn't realise it.
But,
I met a senior in my HCI class that I did a project with, after which I discovered we had lots of similar interests.
We became good friends and started collaborating on design projects and video game prototyping.
Enter the 4th sem and holy mother of God did I get some bad bad profs....
Then it hit me
I have been here for two years, put myself through the meat grinder and tore my soul into shreds.
This Is Not Me
This Won't Be The End Of Me
I called up my sister in London and just vented all my emotions in front of her.
Relief.
Been a long time since I felt that.
I decided to go for what I truly feel passionate about: Game Design
So I am now trying to apply for Universities which have specialised courses for game design.
I've got my groove again, learnt to live again.
Learning C# now.
:)
It's been a long hello, and if you've made it this far somehow, then damn, you the MVP.
Peace.
-
Fuck Apple sideways! (Wait for it, the original part begins)
HOW can a company that keeps on boasting about UX, to the point that at the last WWDC they spent more time speaking about a fucking kaleidoscope animation for their watch than about all the changes to their MacBook's hardware, even consider putting that turd of a remote on the market?
If you have an Apple Store in reach, I invite you to get up, flush, and go there. Try that atrocity that is the AppleTV, and laugh at how incredibly bad it is. Then do your best to refrain from burning the place down in your inevitable rage, that might cause you legal issues.
The previous AppleTV was fine, it was practically useless, but usable. But for some reason they decided a touchpad with low precision and no gestures was a good idea to put on a remote for a thing that will, due to its target audience, most likely be plugged into a fancy TV that has a 300ms lag due to fancy image processing. That thing gives no feedback, apart from visual, and has so little precision that in a movie, the smallest step I've achieved was 5 fucking minutes, while being really fucking careful not to spoil the movie.
-
Applies for Android Internship
Supervisor: Work with this Image Processing Library to "RECOGNIZE" objects from the phone's camera.
Me: Wuuuuuh....?
Supervisor: Also it should be in real time and can't use internet.
Me: But that's impossible....
Supervisor: Align your goals with the company's goal. Nothing is impossible......(gets all motivational)
Me: 😩🔫
-
A couple of days ago, I went through the most embarrassing interview ever. It was a startup that merges hardware and software around image processing. I really wanted it. Really really did. It was a phone interview, and involved a little bit of coding over docs. In the one hour we talked over the phone, he asked me about 30 questions. I hadn't even heard of the words he said! I've never delved into compilers, lower level things, and memory management. I could answer about 5 questions, including the "tell me about yourself" question.
So that's about 25 ways I came up with of saying "I don't know" in a span of 60 minutes.
-
The second coolest project I have worked on was optimizing an image-based algorithm, speeding it up by about 2000x through parallel processing. While the CPU took 7.34 seconds to process just one image and source, the parallel embedded code did it in just 0.0034 seconds :) The coolest one is still in progress, where I am building an AI to control my Mac ;)
-
I wrote a Blender plugin that uses vector math, matrices, calculus, trigonometry, and likely other types of math. There's recursion, filesystem access, image processing, interface logic, and on and on.
And worst of all - other people are expected to use it, so there's added pressure to do a good job.
Oh, the hours I spent trying to figure out why the imported geometry looked like an exploded mess. Fumbling around with mathematics I didn't fully understand was exhausting. Finding help was impossible at times because I didn't have the vocabulary to even describe the problems I was having. And getting it to complete an import before the heat death of the universe was not easy.
Every time I made progress and thought I was done, I would discover a bug that other importers didn't have, leaving me to sift through languages that definitely aren't Python to see if I could reverse engineer the logic they used.
I almost gave up a few times, but didn't.
Now I have something that, while not used by many people, works very well, is very efficient, and doubles as a palate cleanser when I need to do something for fun or for a challenge. Plus I learned a lot along the way.
-
I spent 2 hours last night trying to figure out how to rotate an image in Java without clipping it. I was all in with a pencil and paper sketching everything out to make sure my math was right.
Turns out I was calculating the new image size correctly, but I used the wrong variables to define the new dimensions...
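For reference, the usual bounding-box math for a rotated rectangle can be sketched as below. It is written in Python rather than Java to keep it short; the `rotated_size` name is made up for illustration and is not the code from the rant. The same two formulas carry over to Java unchanged.

```python
import math

def rotated_size(width, height, angle_degrees):
    """Return the (width, height) of the smallest axis-aligned canvas
    that fully contains a width x height image rotated by angle_degrees."""
    theta = math.radians(angle_degrees)
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    # Subtract a tiny epsilon before rounding up so that floating-point
    # noise (e.g. cos(90 degrees) being 6e-17 instead of 0) does not add a pixel.
    new_width = math.ceil(width * cos_t + height * sin_t - 1e-9)
    new_height = math.ceil(width * sin_t + height * cos_t - 1e-9)
    return new_width, new_height

print(rotated_size(200, 100, 90))
```

Note that the new width mixes the old width and the old height, which is exactly where swapping two variables goes unnoticed.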
Sigh.
-
Wow the security by captcha!
Guess what? IIT Kharagpur is considered one of the best institutes in India to study Computer Science, and its major research areas include image processing
-
Really loving Processing and HYPE.
Created this image after a day's work. Here each 10x10 pixel block is an image of Mohanlal (an awesome actor from Kerala).
I took the average color of each block and then matched it with the most dominant color of the sample images. And then drew the sample image onto that box.
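The matching step can be sketched in plain Python as follows. This is a minimal illustration, not the original Processing/HYPE sketch: the pixel tuples, the file names and both helper functions are assumptions, and each sample image is represented only by its precomputed dominant color.

```python
def average_color(pixels):
    """Average a list of (r, g, b) tuples channel by channel."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def closest_sample(target, sample_colors):
    """Return the key in sample_colors whose color is nearest to target,
    using squared Euclidean distance in RGB space."""
    def dist2(color):
        return sum((a - b) ** 2 for a, b in zip(target, color))
    return min(sample_colors, key=lambda name: dist2(sample_colors[name]))

# One (tiny) block of the source image and two hypothetical sample images.
block = [(250, 10, 10), (230, 30, 20), (240, 20, 30)]
samples = {"red_shirt.jpg": (220, 30, 30), "blue_sky.jpg": (40, 90, 200)}
print(closest_sample(average_color(block), samples))
```

For a full mosaic this would run once per 10x10 block, drawing the chosen sample image into that block's position.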
Thanks Daniel Shiffman for the inspiration.
-
ARE YOU READY FOR WORKPLACE BRAIN SCANNING?
Extracting and using brain data will make workers happier and more productive, backers say
https://spectrum.ieee.org/neurotech...
"What takes much more time are the cognitive and motor processes that occur after the decision making—planning a response (such as saying something or pushing a button) and then executing that response. If you can skip these planning and execution phases and instead use EEG to directly access the output of the brain’s visual processing and decision-making systems, you can perform image-recognition tasks far faster. The user no longer has to actively think: For an expert, just that fleeting first impression is enough for their brain to make an accurate determination of what’s in the image."
-
Holy duck, I lost two days on a convolutional autoencoder split into two separate neural networks, so that encoding and decoding can run separately, because its reconstruction showed some strange behaviour. I was giving it an image as input and then saving the encoded compressed representation as a new image, so that I could decode it with the decoder whenever I wanted while saving space.
How much retarded am I?
The internal layer's weights had no constraints, so during the learning phase the convolutional filters could contain any number, greater than 255 or even negative. I cannot save such values in a new image as they are, so they were automatically clipped to between 0 and 255, with a huge loss of information.
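One possible workaround, sketched here under the assumption that the encoded values really must be stored as 8-bit pixels, is to rescale them into 0..255 and keep the offset and scale alongside the image. The function names are made up, the values are plain Python floats rather than tensors, and this is not necessarily how the author fixed it.

```python
def clip_to_bytes(values):
    """What an image writer typically does: clamp everything to 0..255."""
    return [min(255, max(0, round(v))) for v in values]

def quantize(values):
    """Min-max rescale arbitrary floats into 0..255 integer codes.
    Returns the codes plus the offset and scale needed to undo it."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant input
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Approximate inverse of quantize (error is at most scale / 2)."""
    return [c * scale + lo for c in codes]

latent = [-3.5, 0.0, 120.0, 400.0]   # hypothetical encoder outputs
print(clip_to_bytes(latent))         # negatives and values above 255 are lost
codes, lo, scale = quantize(latent)
print(dequantize(codes, lo, scale))  # close to the original values
```

Rescaling still loses precision, so storing the representation in a float array format instead of an image avoids the problem entirely.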
It's so frustrating when you rewrite the code in every possible way, obtain the same wrong result, and then realize that it was a borderline behaviour of a third-party library.
-
VR/AR research.
I used to work as a photographer then got more interested in image processing and that got me into programming.
I somehow just ended up in my current position which is pretty much my dream job. I don't know if I could work as a "normal" programmer. Research projects tend to be extremely hectic and the goal is not to produce a perfect piece of software but to make prototypes to prove a certain concept might work. It is not possible to focus only on single technology and sometimes the technology is not mature at all.
All this means that sometimes the prototype might be a spaghetti code nightmare which works as long as you don't touch anything. But when you get follow-up projects you are able to refine the concept and eventually have quite a tidy code base.
Currently I'm making projects with HoloLens and luckily I have had time to clean up some components from previous projects. It feels quite nice to have working technology and lots of ready-made building blocks. I can finally make stable prototypes quite rapidly.
I'll enjoy this situation until some new crappy world-changing technology comes along...
-
Back with more features now!
Cuz I don't have anything to do at work
This image is composed of screenshots from season three
-
How often do we come across IT managers who don't plan their work properly?
I teach software development and programming at a vocational school. Our IT manager said that we got a certain budget influx and that he could procure new computers for our teaching facilities. I happily agreed and hinted that I would really like some new hardware with proper graphics cards so I could do a few small projects with Unreal Engine, Unity3D or use Adobe products without hardware lag. The new computers arrived about a week ago and then the "fun" started.
He had ordered some PCs with proper graphics cards and processing power and talked about putting them up in my classroom, so where's the "fun" I mentioned? He only ordered half a classroom's worth of them - I guess the budget didn't allow for more. A week later I was supposed to move to a new room and was waiting for my new computers to be installed, and yet the IT manager said that my computers would be moved along with me. I was appalled - what had happened to the new PCs he promised?
Turns out he had put them up in another building without notice; a teacher there wanted to do an extracurricular movie-making activity (that included a bit of video editing at some point). That classroom is always in use, so me getting more than 1-2 hours a week in there is nigh impossible.
In the end I got no new computers, hardware or software.... He didn't even bother to switch out the 2 "temporary" laptops I have had in my classroom for 2 years due to a small shortage back then, and even these have an old image that doesn't include a third of the software I normally use.
PS. He had about another 2-3 classrooms' worth of new PCs but those were promised to the other IT teachers back then....
-
Learning image processing, deep learning, machine learning, data modeling, data mining and related topics, and even working on them, is so much easier than installing the requirements, packages and tools that go with them!
-
OpenCV is like a USB port: You never have the right input format and it only works after converting the damn thing multiple times!
-
My status:
Graduate student studying Computer Science at the University of New York at Buffalo.
4 Computer Vision and Image Processing Projects
3 Distributed Systems projects (Android apps).
Red Hat Certifications.
Applied to 135 companies for an internship program.
Here are the replies I have gotten so far.
" We have analyzed your resume and we think you'll be a great choice for this position at our company."
What position?
MARKETING INTERN.
FML!
-
Father bought a computer for the family in 2011. An HCL Dual Core Pentium 4 machine with a 21" TFT screen. I was allowed to use it only in someone's presence for at most 20 minutes each day for the next 6-9 months.
After that we got a network card (plug-and-play internet dongle) for the internet services. That's when I entered the world of internet and made a Facebook account. I was 12 then ¯\_(ツ)_/¯.
After two years or so, we were playing games on it, watching movies and using MS Word for school-related stuff. Then my brother entered college and used it for stuff like coding and image processing in Matlab, while I watched him do so and got yelled at for doing what I liked to do at the same time.
After 5 years or so, I got a personal laptop with a decent configuration for college work. The old computer still worked like a charm.
Now, the old monk is at rest with old memories, unknown files and a lot of Bollywood songs.
-
I hate looking for sources to cite in my thesis.
Either it's a book for 100€ or a paper that I need an IEEE Xplore subscription for..
And most of the stuff on Springer Link (which I get free access to through uni) is behind a paywall anyway...
On that note, does anyone happen to know decent sources on basic signal processing and image processing?
I.e. DCT, DFT transformations and so on
-
So I wrote some code to sort images in folders based on dates.
Like 2024>06>12.
I thought that's a good little script for GPT to help me out with, as I wanted to write it in Rust.
Everything was fine and after processing all images and videos for 24 hours I was happy.
My test runs worked well.
Two days passed and I realize something.
Some images are not put in date folders. Why? Well I guess a little bug.
Starting to dive deep and checking if other images are in folders.
I see that I have images in folders since 2015 for most months and dates.
But why are some not put in exact day folders.
So another deep dive and I find out that the creation date is different to the folder the images are in.
Often its off by months.
Turns out I forgot to double check how the code generated by GPT maps the time between image creation date and unix epoch to a date folder.
It was just doing a division by an approximation of seconds that a month has, a year has, and a day has.
This caused things to be completely off the further away we go from 1970.
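To illustrate the difference, here is a small Python sketch (the original script was in Rust, and neither function below is the actual generated code). `naive_date_folder` is a guess at a division-based mapping like the one described, assuming 365-day years and 30-day months; `date_folder` lets the calendar library handle leap years and month lengths. Interpreting the timestamp as UTC is also an assumption.

```python
from datetime import datetime, timezone

DAY = 86400  # seconds per day

def naive_date_folder(epoch_seconds):
    """Division by approximate year/month lengths. Drifts further from
    the real date the more leap years and uneven months have passed."""
    days = epoch_seconds // DAY
    year = 1970 + days // 365
    rest = days % 365
    month = 1 + rest // 30
    day = 1 + rest % 30
    return f"{year:04d}/{month:02d}/{day:02d}"

def date_folder(epoch_seconds):
    """Convert the timestamp with a real calendar implementation."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y/%m/%d")

timestamp = 1718150400  # midnight UTC on 12 June 2024
print(date_folder(timestamp))
print(naive_date_folder(timestamp))
```

In Rust the same idea applies: hand the timestamp to a date/time library instead of dividing by hand.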
Lucky me that I did not mess up the creation dates :)
Looks like another 24 hours run
-
Had to port some Python code that some other guy wrote using OpenCV for some image processing stuff to Java. I thought "how tough can it be? Let's just try it out in Python first just to verify the results", only to waste an entire fucking day trying to install OpenCV and make it work, and on top of that the crappy OpenCV documentation was no help. In the end I had to just give up on this shit and do the Java implementation, which I later verified against the Python guy's results.
-
Just got accepted as a tutor. I have to teach PhD students in the medical field the SciKit package for image processing. Been coding in Python using pandas and numpy for years, but I know jack shit about SciKit.
I applied just for fun and got the position. Now I am fucking terrified.
Meanwhile I rejected a Teaching Assistant position because of this one.
-
Pre 2k I started making levels in UnrealEd, which changed the way I saw the world. Suddenly I could look at things, buildings, architecture for a long time, just thinking about how I would build something like that from simple polygons.
As a coder I started to analyze the way processes are controlled in logic.
And now after some years in automation technology and image processing, other things come to my mind like "give me 50k€ in hardware and some weeks and I could replace that person's job with a system".
-
a client reached out today who wanted a website which had a dl model for image processing
website -> take image -> pass through model -> based on result sort images and display in grid along with other features
budget? $856 -
A colleague of mine:
God damn, my application is racist.
Me: why?
Answer: It doesn't see enough white.
-
My work product: Or why I learned to get twitchy around Java...
I maintain a Java based test system, that tests a raster image processor. The client is a Java swing project that contains CORBA bindings to the internal API of the raster image processor. It also has custom written UI elements and duplicated functionality that became available in later versions of Java, but because some of the third party tools we use don't work with later versions of Java for some reason, it's not possible to upgrade Java to gain things as simple as recursive directory deletion, yes the version of Java we have to use does not support something as simple as that and custom code had to be written to support it.
Because of the requirement to build the API bindings along with the client the whole application must be built with the raster image processor build chain, which is a heavily customised jam build system. So an ant task calls out to execute a jam task and jam does about 90% of the heavy lifting.
In addition to the Java code there's code for interpreting PostScript files, as these can be used to alter the behaviour of the raster image processor during testing.
As if that weren't enough, there's a beanshell interface to allow users to script the test system, but none of the users know Java well enough to feel confident writing interpreted Java scripts (and that's too close to JavaScript for my comfort). I once tried swapping this out for the Rhino JavaScript interpreter and got all the verbal support in the world but no developer time to design an API that'd work for all the departments.
The server isn't much better though. It's a Tomcat based application that was written by someone who had never built a Tomcat application before, or any web application for that matter. It uses raw SQL strings instead of an ORM, it doesn't use MVC in any way, and an insane amount of functionality is dumped into the JSP files.
It too interacts with a raster image processor to create difference masks of the output, running PostScript as needed. It spawns off multiple threads and can spend days processing hundreds of gigabytes of image output (depending on the size of the tests).
We're stuck on Tomcat seven because we can't upgrade beyond Java 6, which brings a whole manner of security issues, but that eager little Java updater will break the tool chain if it gets its way.
Between these two components we have the Java RMI server (sometimes) working to help generate image data on the client side before all images are pulled across a UNC network path onto the server that processes test jobs (in PDF format), by reading into the xref table of said PDF, finding the embedded image data (for our server consumed test files are just flate encoded TIFF files wrapped around just enough PDF to make them valid) and uses a tool to create a difference mask of two images.
This tool is very error prone, it can't difference images of different sizes, colour spaces, orientations or pixel depths, but it's the best we have.
The tool is installed on both the client and the server. If the client can generate images, it'll query the server for which ones it needs to generate, and if it can't, the server will use the tool itself.
Our shells have custom profiles for linking to a whole manner of third party tools and libraries, including a link to visual studio 2005 (more indirectly related build dependencies), the whole profile has to ensure that absolutely no operating system pollution gets into the shell, most of our apps are installed in our home directories and we have to ensure our paths are correct for every single application we add.
And... Fucking and!
Most of the tools are stored as source bundles in a version control system... Not git or Mercurial, not Perforce or svn, not even CVS... They use a custom built version control system that is built on top of RCS. It keeps a central database of locked files (using soft and hard locks along with write protecting the files in the file system) to ensure users can't get merge conflicts by preventing other users from writing to the files at all.
Branching is heavy weight and can take the best part of a day to create a new branch and populate the history.
Gathering the tools alone to build the Dev environment to build my project takes the best part of a week.
What should be a joy come hardware refresh year becomes a curse ("Well fuck, now I lose a week setting up the Dev environment on ANOTHER machine").
Needless to say, I enjoy NOT working with Java. A lot of this isn't Java's fault, but there are a lot of things that Java (specifically the Java 6 version we're stuck on) does not make easy.
This is why I prefer to build my web apps in Python or Node, hell, I'd even take Lua... Just... Compiling web pages into executable Java classes, why? I mean I understand the implementation of how this happens, but why did my predecessor have to choose this? Why?
-
The hype around Artificial Intelligence and Neural Nets makes me sicker by the day.
We all know that the potential power of AI gives stock prices a bump and bolsters investor confidence. But too many companies are reluctant to address its very real limits. It has evidently become a taboo to discuss AI’s shortcomings and the limitations of machine learning, neural nets, and deep learning. However, if we want to strategically deploy these technologies in enterprises, we really need to talk about their weaknesses.
AI lacks common sense. AI may be able to recognize that within a photo, there’s a man on a horse. But it probably won’t appreciate that the figures are actually a bronze sculpture of a man on a horse, not an actual man on an actual horse.
Let's consider the lesson offered by Margaret Mitchell, a research scientist at Google. Mitchell helps develop computers that can communicate about what they see and understand. As she feeds images and data to AIs, she asks them questions about what they “see.” In one case, Mitchell fed an AI lots of input about fun things and activities. When Mitchell showed the AI an image of a koala bear, it said, “Cute creature!” But when she showed the AI a picture of a house violently burning down, the AI exclaimed, “That’s awesome!”
The AI selected this response due to the orange and red colors it scanned in the photo; these fiery tones were frequently associated with positive responses in the AI’s input data set. It’s stories like these that demonstrate AI’s inevitable gaps, blind spots, and complete lack of common sense.
AI is data-hungry and brittle. Neural nets require far too much data to match human intellects. In most cases, they require thousands or millions of examples to learn from. Worse still, each time you need to recognize a new type of item, you have to start from scratch.
Algorithmic problem-solving is also severely hampered by the quality of data it’s fed. If an AI hasn’t been explicitly told how to answer a question, it can’t reason it out. It cannot respond to an unexpected change if it hasn’t been programmed to anticipate it.
Today’s business world is filled with disruptions and events—from physical to economic to political—and these disruptions require interpretation and flexibility. Algorithms alone cannot handle that.
"AI lacks intuition". Humans use intuition to navigate the physical world. When you pivot and swing to hit a tennis ball or step off a sidewalk to cross the street, you do so without a thought—things that would require a robot so much processing power that it’s almost inconceivable that we would engineer them.
Algorithms get trapped in local optima. When assigned a task, a computer program may find solutions that are close by in the search process—known as the local optimum—but fail to find the best of all possible solutions. Finding the best global solution would require understanding context and changing context, or thinking creatively about the problem and potential solutions. Humans can do that. They can connect seemingly disparate concepts and come up with out-of-the-box thinking that solves problems in novel ways. AI cannot.
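A toy illustration of getting stuck in a local optimum, using a deliberately simple greedy search. The `landscape` function and its two peaks are invented for this example; real optimizers are more elaborate, and many use restarts or randomness precisely to escape this trap.

```python
def hill_climb(f, x, step=1):
    """Greedy local search: move to the better neighbour until
    neither neighbour improves on the current position."""
    while True:
        # x comes first so that ties keep the search where it is.
        best = max((x, x - step, x + step), key=f)
        if best == x:
            return x
        x = best

def landscape(x):
    """Two peaks: a small one at x = 2 and a taller one at x = 10."""
    return max(4 - (x - 2) ** 2, 10 - (x - 10) ** 2)

print(hill_climb(landscape, 0))  # climbs the nearby small peak and stops there
print(hill_climb(landscape, 7))  # a different start reaches the taller peak
```

Starting from 0, the search never sees the taller peak because every step toward it first goes downhill.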
"AI can’t explain itself". AI may come up with the right answers, but even researchers who train AI systems often do not understand how an algorithm reached a specific conclusion. This is very problematic when AI is used in the context of medical diagnoses, for example, or in any environment where decisions have non-trivial consequences. What the algorithm has “learned” remains a mystery to everyone. Even if the AI is right, people will not trust its analytical output.
Artificial intelligence offers tremendous opportunities and capabilities, but it can't see the world as we humans do. What we need to do is work on its weaknesses and sort them out, rather than overhype it with make-believe and ignore limitations that are in plain sight.
Ref: https://thriveglobal.com/stories/...
-
Designed a person detection and tracking algorithm based on R-CNN and the Lucas-Kanade tracking algorithm, using OpenCV in Python.
Need help with cases of occlusion. Any suggestions?
-
I used to develop desktop programs in C++ with algorithms related to image processing and computer vision. However, new projects appeared, and one of them was a web project using Drupal. It was my first experience with web development and I am still having nightmares... It is the worst thing you can do: continuing a big project without understanding the technology or the framework... Now I am more experienced and I prefer stacks like MERN. Easy debugging on the web is so crucial... Maybe I will have to switch to WebAssembly.
-
Make your code available for your team members, please.
So we're working on this robotics project using ROS, a framework that enables multiple nodes in a network to exchange their functionality with each other through TCP connections. Each node can be implemented and executed on your own machine, and tested with dummy inputs, but in collaboration they make a robot do fancy stuff.
The knowledge base needs data from the image processing unit and provides this data, with semantic context, to high-level planning. High-level planning uses this semantic data for decision making and calls the robot manipulation node with meaningful input to move the robot's components in the environment. To run each node, we use a dedicated machine, which pulls the corresponding repositories and is always kept correctly configured, so that everybody has access to each other's work when needed.
So far so good. We have been trying to convince the manipulation guy (let's call him John) to run his code on our central machine, not just for the past week but since day one, 5 months ago. Our cluster classification was unavailable for 2 months, but my colleague fixed that. We still can't run the whole project without John's computer. If his machine blows up we're fucked.
Each milestone feels like a big-bang-test, fixing issues in interfaces last-minute. We see the whole demo just moments before our supervisors arrive at the door.
I just hope he doesn't get hit by a truck.
-
My company designs floor plans and does some Photoshop work for clients.
One project was to resize each image to a certain width and height and place it in the center of the photo with 40px of padding around it.
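The layout arithmetic for a task like this can be sketched as follows. This is not the Adobe script from the post; it is a plain Python illustration that assumes the image should keep its aspect ratio and fit inside the canvas minus the padding on every side. The function name `fit_centered` and its parameters are made up for this example.

```python
def fit_centered(src_w, src_h, canvas_w, canvas_h, padding=40):
    """Return (new_w, new_h, left, top) for an image scaled to fit inside
    the canvas with `padding` pixels kept free on every side, centered."""
    avail_w = canvas_w - 2 * padding
    avail_h = canvas_h - 2 * padding
    if avail_w <= 0 or avail_h <= 0:
        raise ValueError("canvas too small for the requested padding")

    # Use the smaller ratio so the image fits in both directions.
    scale = min(avail_w / src_w, avail_h / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))

    # Center the scaled image on the full canvas.
    left = (canvas_w - new_w) // 2
    top = (canvas_h - new_h) // 2
    return new_w, new_h, left, top


# A 2000x1000 source placed on an 800x600 canvas.
print(fit_centered(2000, 1000, 800, 600))
```

The returned size and offset can then be handed to whatever tool does the actual resizing and pasting.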
I wrote an Adobe ExtendScript script to help the design department, and it processed thousands of images within an hour.
My boss was so impressed that he called a meeting with me. He said: "You need to lead the IT department and create a system that can detect the client's requirements and complete the drawing with Adobe Illustrator automatically".
Me, thinking: (Meh, I have no knowledge of image processing and my mathematics is poor. Where can I go die with these requirements?)
-
Some friends of mine were working on doing neural network image processing and wanted to build a social network for it. I got to play with graph databases, mobile app development, and neural nets. Unfortunately, project never took off, but it was fun nevertheless.
-
I have one question for everyone:
I am basically a full stack developer who works with cloud technologies for platform development. For the past 5-6 months I have been working on a product which uses machine learning algorithms to generate metadata from video.
One of the algorithms uses TensorFlow to predict the locale from an image. It takes an image of size ~500 KB and needs around 15 sec to predict the 5 possible locales from a pre-trained model. Now when I send more than 100 requests to the code concurrently, it stops working and TensorFlow throws some error. I am using a machine with 32 vCPUs and 120 GB of RAM. When I ask the decision scientists on my team, they say that the processing load is high, that a lot of calculation is happening behind the scenes, and that it requires a GPU.
As far as I understand, a GPU makes sense for training, but for prediction or testing I do not think we need such heavy infra. Please help me understand if I am wrong.
PS: all the decision scientists on the team are basically dumb fucks, and they always have one answer: use a GPU.
-
We are using a camera in a practical image processing course at college. That camera has its own library to communicate with it, so I tried to download the library so I could prepare for the course.
It took like 10 minutes to find out that the library is only given to buyers.
In the package with the camera is a password which you need to download the library. Even the documentation is behind that stupid "pay"-wall.
Yeah, your library can only talk to your cameras, so I need one of the cameras to use it. So why are the library and its complete documentation not public?!
Eventually I copied all of it from the college computers.
Maybe I'm just too spoiled by the broad availability of OSS ...
-
Some computer scientists believe that OpenCV has between 3 and 8 different documentation websites, but we may never know for sure.
-
I'm currently a Java developer. I've dabbled in Python too. Mostly worked on API development and some data processing. I want to learn something new that'll keep me engaged. It can be something within Java (like image processing or NLP) or some other language (Go, Scala, JS). What do you all suggest?
-
Hi guys,
can somebody recommend a digital camera that can be controlled programmatically by a computer such as a Raspberry Pi?
I want to try some image processing algorithms and time lapse on live video. The price should be around $200-300, and it is really important that I can program it.
-
Anyone tried converting speech waveforms to some type of image and then using those as training data for a stable diffusion model?
Hypothetically it should generate "ultrarealistic" waveforms for phonemes, for any given style of voice. The training labels are naturally the words or phonemes themselves, in text format (well, embedding vectors fwiw)
After that it's a matter of testing text-to-image, which should generate the relevant phonemes as images of waveforms (or your given visual representation, however you choose to pack it)
I would have tried this myself but I only have 3 GB of VRAM.
Even rudimentary voice generation that produces recognizable words from text input, would be interesting to see implemented and maybe a first for SD.
In other news:
Implementing SQL for an identity explorer. Basically the system generates sets of values for given known identities, and stores the formulas as strings, along with the values.
For any given value test set we can then cross reference to look up equivalent identities. And then we can test if these same identities hold for other test sets of actual variable values. If not, the identity string can be removed, or gophered elsewhere in the database for further exploration and experimentation.
I'm hoping by doing this, I can somewhat automate the process of finding identities, instead of relying on logs and using the OS built-in text search for test value (which I can then look up in the files that show up, and cross reference the logged equations that produced those values), which I use to find new identities.
I was even considering processing the logs of equations and identities as some form of training data perhaps for a ML system that generates plausible new identities but that's a little outside my reach I think.
Finally, now that I know the new modular function converts semiprimes into numbers with larger factor trees, I'm thinking of writing a visual browser that maps the connections from factor tree to factor tree, making them expandable and collapsible, and allowing adjusting the formula and regenerating trees on the fly.
-
OK, so I've been working on processing a Japanese dictionary file and things are going smoothly for the most part. Out of ~185,000 entries, I've got 35 that are still causing problems.
The error I'm getting is "Incorrect string value '\xF0\xA4\xAD\xAF' for column...". I've checked all of my encoding and collation settings, and I'm pretty sure I've got it set to properly implement all of Unicode (as well as it does, anyway), as shown in the image attached. My suspicion is the problem characters are likely among the JIS X 0213 character set; in either case we're clearly dealing with a 4-byte character encoding issue here.
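A quick way to confirm the 4-byte suspicion is to decode the bytes from the error message and look at the code point. The sketch below does that in Python; the helper name `supplementary_chars` is made up for this example. One thing worth checking, as a general MySQL fact rather than a diagnosis of this particular setup: the `utf8` (`utf8mb3`) character set stores at most 3 bytes per character, so characters like this one need `utf8mb4` on the column, the table, and the connection.

```python
# The byte sequence quoted in the error message.
raw = b"\xF0\xA4\xAD\xAF"

ch = raw.decode("utf-8")   # a single character
code_point = ord(ch)

# U+24B6F lies above U+FFFF, i.e. outside the Basic Multilingual Plane,
# so UTF-8 needs 4 bytes for it.
print(f"U+{code_point:X}", len(raw), code_point > 0xFFFF)


def supplementary_chars(text):
    """Return the characters in `text` that need 4 bytes in UTF-8."""
    return [c for c in text if ord(c) > 0xFFFF]


# Hypothetical dictionary entry mixing ordinary kanji with the problem character.
entry = "日本" + ch
print(supplementary_chars(entry) == [ch])
```

Running a check like `supplementary_chars` over the 35 failing entries would show whether they all share this property.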
If needed I can attach a flag in the database and base64 encode these particular entries so the data isn't lost, but I'd like to just get it to handle the data properly in the first place if possible.
Anyone have any ideas on other items I can check to resolve the error?
-
Julia sucks!
It has similar syntax to Python and it's messing with my Python knowledge.
Thanks to my image processing professor, who prefers Julia over Python because it's faster! And then he uses a package called Pluto (similar to Jupyter), which makes running Julia code super frustrating.
-
A question about image processing and machine learning.
I post random images of the Earth to Twitter. I would like the bot to detect whether an image is a bad one. How can I do this? Here is an example of a bad image.
-
The girl "Lenna" is a proud one
Image processing is a beautiful thing
Her two eyes are blue
For those of us who code, only God can help
-
Can anyone suggest a GitHub link for a GAN (deep learning) for generating new images, one that is widely used by researchers?
-
ComPDFKit Solutions
For text extraction technology, ComPDFKit offers the following two solutions that effectively address text extraction for all types of PDF files. For documents containing only text information, our non-intelligent solution can suffice. But for more complex documents and image-based ones, ComPDFKit Document AI offers higher accuracy in text extraction. To learn about the accuracy of ComPDFKit's information extraction, see this article.
1. Algorithm: X-Y Cut Recursion Projection Method
The X-Y Cut Recursion Projection Method is a top-down page segmentation technique that decomposes a document image into rectangular blocks. It employs a recursive approach by projecting along both the X and Y axes to segment a PDF into independent rectangles, facilitating the extraction of textual components. ComPDFKit utilizes this method for efficient text separation and structural organization, including rows, paragraphs, and columns, to retrieve characters, words, lines, and paragraphs from the document.
The advantage of the X-Y Cut Recursion Projection Method is its speed, making it suitable for simple, structured, non-image-based PDF documents. However, for complex, unstructured PDFs, there might be recognition errors or omissions.
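The recursive projection idea can be sketched on a toy binary grid, where 1 marks ink and 0 marks whitespace. This is a minimal illustration of the general X-Y cut technique, not ComPDFKit's implementation; the function names, the `min_gap` parameter, and the sample page are all assumptions made for this example.

```python
def runs(profile, min_gap):
    """Split a projection profile into ink segments (start, end), half-open,
    separated by at least `min_gap` consecutive empty entries."""
    segments, start, end, gap = [], None, 0, 0
    for i, value in enumerate(profile):
        if value:
            if start is None:
                start = i
            end, gap = i + 1, 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                segments.append((start, end))
                start = None
    if start is not None:
        segments.append((start, end))
    return segments


def xy_cut(page, box=None, min_gap=1):
    """Recursively cut `page` into blocks, returned as (top, bottom, left, right)."""
    if box is None:
        box = (0, len(page), 0, len(page[0]))
    top, bottom, left, right = box

    # Project ink onto the Y axis (row sums) and the X axis (column sums).
    row_profile = [sum(page[y][left:right]) for y in range(top, bottom)]
    col_profile = [sum(page[y][x] for y in range(top, bottom))
                   for x in range(left, right)]
    row_segs = runs(row_profile, min_gap)
    col_segs = runs(col_profile, min_gap)

    if not row_segs:  # region contains no ink
        return []
    if len(row_segs) > 1:  # whitespace rows found: cut into horizontal bands
        return [b for a, z in row_segs
                for b in xy_cut(page, (top + a, top + z, left, right), min_gap)]
    if len(col_segs) > 1:  # whitespace columns found: cut into vertical strips
        return [b for a, z in col_segs
                for b in xy_cut(page, (top, bottom, left + a, left + z), min_gap)]

    # No further cut possible: emit the tight bounding box as a leaf block.
    (r0, r1), (c0, c1) = row_segs[0], col_segs[0]
    return [(top + r0, top + r1, left + c0, left + c1)]


# Toy page: two side-by-side blocks on top, one small block below.
page = [[int(c) for c in row] for row in (
    "1111011",
    "1111011",
    "0000000",
    "1100000",
)]
blocks = xy_cut(page)
print(blocks)
```

Because a cut is only made where a projection profile drops to zero, this approach is fast on clean layouts, and it has nothing to cut along when regions overlap in both projections, which matches the limitation described above.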
2. ComPDFKit Document AI
Document AI is an intelligent text extraction solution supporting all types of PDF files, including image-based. It uses artificial intelligence-based methods for document recognition and analysis to extract textual information from PDF documents (as well as images, tables, etc.).
- PDF Recognition and Analysis: This involves using deep learning models to recognize and analyze PDF documents, extracting elements like text, images, and tables while retaining their position, size, style, etc. ComPDFKit has well-trained AI models to accomplish this process.
- Image Pre-processing: This process involves improving the quality and clarity of low-quality images in PDF documents, enhancing subsequent recognition and analysis. ComPDFKit employs multiple image processing techniques, such as image sharpening enhancement, noise reduction, document trimming and straightening, and stamp detection.
- OCR (Optical Character Recognition): OCR technology has a wide range of application scenarios such as license plate recognition, bank card information extraction, identity document (ID card) information recognition, train ticket information detection, etc. ComPDFKit supports recognition in dozens of languages. With an extensively trained model zoo, it can accurately detect and recognize text in documents and analyze document structure.
-
//not a rant, just a question
Yeah I know SO is the place to ask such stuff, but I still wanna ask it here.
I have started with OpenCV for image processing. The sad thing is it is available for Python only. Is there a PHP alternative? The best I have found is ImageMagick, which doesn't come close to OpenCV.